PATENT APPLICATION
ATTY REF NO. PP20663.002

IMMUNOGENIC COMPOSITIONS FOR STREPTOCOCCUS PYOGENES
This application incorporates by reference in its entirety U.S. provisional patent application
No. 60/491,822, filed on July 31, 2003.

TECHNICAL FIELD 

This invention is in the fields of immunology and vaccinology. In particular, it relates to 
antigens derived from Streptococcus pyogenes and their use in immunisation. All documents cited 
herein are incorporated by reference in their entirety. 

BACKGROUND ART 

Group A streptococcus ("GAS", S.pyogenes) is a frequent human pathogen, estimated to
be present in 5-15% of normal individuals without signs of disease. When host defences
are compromised, or when the organism is able to exert its virulence, or when it is introduced to
vulnerable tissues or hosts, however, an acute infection occurs. Related diseases include
puerperal fever, scarlet fever, erysipelas, pharyngitis, impetigo, necrotising fasciitis, myositis and
streptococcal toxic shock syndrome.

Although S.pyogenes may be treated using antibiotics, a prophylactic vaccine to prevent 
the onset of disease is desired. Efforts to develop such a vaccine have been ongoing for many 
decades. While various GAS vaccine approaches have been suggested and some approaches are 
currently in clinical trials, to date, there are no GAS vaccines available to the public. 

It is an object of the invention to provide further and improved compositions for providing
immunity against GAS disease and/or infection. The compositions are based on a combination of two
or more (e.g. three or more) GAS antigens.

DISCLOSURE OF THE INVENTION 

Applicants have discovered a group of thirty GAS antigens that are particularly suitable for
immunisation purposes, particularly when used in combinations. In addition, Applicants have
identified a GAS antigen (GAS 40) which is particularly immunogenic when used either alone or in
combinations with additional GAS antigens.

The invention therefore provides an immunogenic composition comprising GAS 40, a 
fragment thereof or a polypeptide having sequence identity thereto. The invention further includes an 
immunogenic composition comprising a combination of GAS antigens, said combination consisting 
of two to ten GAS antigens, wherein said combination includes GAS 40 or a fragment thereof or a 
polypeptide having sequence identity thereto. Preferably, the combination consists of three, four, 
five, six, or seven GAS antigens. Still more preferably, the combination consists of three, four, or five 
GAS antigens. 

The invention also provides an immunogenic composition comprising a combination of GAS
antigens, said combination consisting of two to thirty-one GAS antigens of a first antigen group, said
first antigen group consisting of: GAS 117, GAS 130, GAS 277, GAS 236, GAS 40, GAS 389, GAS
504, GAS 509, GAS 366, GAS 159, GAS 217, GAS 309, GAS 372, GAS 039, GAS 042, GAS 058,
GAS 290, GAS 511, GAS 533, GAS 527, GAS 294, GAS 253, GAS 529, GAS 045, GAS 095, GAS
193, GAS 137, GAS 084, GAS 384, GAS 202, and GAS 057. These antigens are referred to herein as
the 'first antigen group'. Preferably, the combination of GAS antigens consists of three, four, five,
six, seven, eight, nine, or ten GAS antigens selected from the first antigen group. Preferably, the
combination of GAS antigens consists of three, four, or five GAS antigens selected from the first
antigen group.
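
The number of distinct combinations encompassed by a given combination size can be illustrated with a short calculation. The following Python sketch is illustrative only: the function name `count_combinations` and its `must_include` parameter are assumptions introduced here, and the only figure taken from the text is the thirty-one antigens listed in the first antigen group.

```python
from math import comb

# Number of antigens listed in the first antigen group (from the text above).
FIRST_GROUP_SIZE = 31


def count_combinations(k: int, must_include: int = 0) -> int:
    """Count k-antigen combinations drawn from the first antigen group.

    `must_include` is the number of antigens fixed in advance
    (for example 1 when every combination has to contain GAS 40).
    """
    if not 0 <= must_include <= k <= FIRST_GROUP_SIZE:
        raise ValueError("require 0 <= must_include <= k <= group size")
    # The fixed antigens are always present; choose the rest from what remains.
    return comb(FIRST_GROUP_SIZE - must_include, k - must_include)


if __name__ == "__main__":
    for k in (3, 4, 5):
        print(k, count_combinations(k), count_combinations(k, must_include=1))
```

Fixing one antigen, such as GAS 40, reduces the count for a three-antigen combination from the number of ways of choosing three of thirty-one to the number of ways of choosing two of the remaining thirty.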

GAS 39, GAS 40, GAS 57, GAS 117, GAS 202, GAS 294, GAS 527, GAS 533, and GAS
511 are particularly preferred GAS antigens. Preferably, the combination of GAS antigens includes
either or both of GAS 40 and GAS 117. Preferably, the combination includes GAS 40.

Representative examples of some of these antigen combinations are discussed below. 

The combination of GAS antigens may consist of three GAS antigens selected from the first
antigen group. Accordingly, in one embodiment, the combination of GAS antigens consists of GAS
40, GAS 117 and a third GAS antigen selected from the first antigen group. Preferred combinations
include GAS 40, GAS 117 and a third GAS antigen selected from the group consisting of GAS 39,
GAS 57, GAS 202, GAS 294, GAS 527, GAS 533, and GAS 511.

In another embodiment, the combination of GAS antigens consists of GAS 40 and two
additional GAS antigens selected from the first antigen group. Preferred combinations include GAS
40 and two GAS antigens selected from the group consisting of GAS 39, GAS 57, GAS 117, GAS
202, GAS 294, GAS 527, GAS 533, and GAS 511. In another embodiment, the combination of GAS
antigens consists of GAS 117 and two additional GAS antigens selected from the first antigen group.
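
The preferred three-antigen combinations described above can be listed mechanically. The sketch below is a minimal illustration, assuming a helper named `combos_with` (not part of the disclosure) that fixes the required antigens and fills the remaining positions from the particularly preferred antigens named in the text.

```python
from itertools import combinations

# Particularly preferred antigens other than GAS 40, as named in the text.
PREFERRED = [
    "GAS 39", "GAS 57", "GAS 117", "GAS 202",
    "GAS 294", "GAS 527", "GAS 533", "GAS 511",
]


def combos_with(required, pool, size):
    """Return every antigen set of `size` that contains all of `required`.

    The remaining positions are filled from `pool`, skipping any antigen
    that is already required.
    """
    required = tuple(required)
    if len(required) > size:
        raise ValueError("more required antigens than positions")
    rest = [antigen for antigen in pool if antigen not in required]
    return [required + extra
            for extra in combinations(rest, size - len(required))]


if __name__ == "__main__":
    # GAS 40, GAS 117 and one further preferred antigen.
    for combo in combos_with(["GAS 40", "GAS 117"], PREFERRED, 3):
        print(", ".join(combo))
```

The same helper covers the four- and five-antigen embodiments by changing `size`.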

The combination of GAS antigens may consist of four GAS antigens selected from the first
antigen group. In one embodiment, the combination of GAS antigens consists of GAS 40, GAS 117
and two additional GAS antigens selected from the first antigen group. Preferred combinations
include GAS 40, GAS 117, and two GAS antigens selected from the group consisting of GAS 39,
GAS 57, GAS 202, GAS 294, GAS 527, GAS 533, and GAS 511.

In another embodiment, the combination of GAS antigens consists of GAS 40 and three
additional GAS antigens selected from the first antigen group. Preferred combinations include GAS
40 and three additional GAS antigens selected from the group consisting of GAS 39, GAS 57, GAS
117, GAS 202, GAS 294, GAS 527, GAS 533, and GAS 511. In one embodiment, the combination of
GAS antigens consists of GAS 117 and three additional antigens selected from the first antigen group.

The combination of GAS antigens may consist of five GAS antigens selected from the first
antigen group. In one embodiment, the combination of GAS antigens consists of GAS 40, GAS 117
and three additional GAS antigens selected from the first antigen group. Preferred combinations
include GAS 40, GAS 117 and three additional GAS antigens selected from the group consisting of
GAS 39, GAS 57, GAS 202, GAS 294, GAS 527, GAS 533, and GAS 511.

In another embodiment, the combination of GAS antigens consists of GAS 40 and four
additional GAS antigens selected from the first antigen group. Preferred combinations include GAS
40 and four additional GAS antigens selected from the group consisting of GAS 39, GAS 57, GAS
117, GAS 202, GAS 294, GAS 527, GAS 533, and GAS 511. In one embodiment, the combination of
GAS antigens consists of GAS 117 and four additional GAS antigens selected from the first antigen
group.

The combination of GAS antigens may consist of eight GAS antigens selected from the first
antigen group. In one embodiment, the combination of GAS antigens consists of GAS 40, GAS 117
and six additional GAS antigens selected from the first antigen group. In one embodiment, the
combination of GAS antigens consists of GAS 40 and seven additional GAS antigens selected from
the first antigen group. In one embodiment, the combination of GAS antigens consists of GAS 117
and seven additional GAS antigens selected from the first antigen group.

The combination of GAS antigens may consist of ten GAS antigens selected from the first
antigen group. In one embodiment, the combination of GAS antigens consists of GAS 40, GAS 117
and eight additional GAS antigens selected from the first antigen group. In one embodiment, the
combination of GAS antigens consists of GAS 40 and nine additional GAS antigens selected from the
first antigen group. In one embodiment, the combination of GAS antigens consists of GAS 117 and
nine additional GAS antigens selected from the first antigen group.

Each of the GAS antigens of the first antigen group is described in more detail below.
Genomic sequences of at least three GAS strains are publicly available. The genomic sequence of an
M1 GAS strain is reported at Ref. 1. The genomic sequence of an M3 GAS strain is reported at Ref.
2. The genomic sequence of an M18 GAS strain is reported at Ref. 3. Preferably, the GAS antigens
of the invention comprise a polynucleotide or amino acid sequence of an M1, M3 or M18 GAS strain.
More preferably, the GAS antigens of the invention comprise a polynucleotide or amino acid
sequence of an M1 strain.

(1) GAS 117 

GAS 117 corresponds to M1 GenBank accession numbers GI:13621679 and GI:15674571, to M3
GenBank accession number GI:21909852, to M18 GenBank accession number GI:19745578, and is
also referred to as 'SpyrtMS* (M1), *SpyM3 JJ316' (M3), and 4 SpyM18J)49r (M18). Examples of
amino acid and polynucleotide sequences of GAS 117 of an M1 strain are set forth below:

SEQ ID NO: 1 

MTLKKHYYLLSLLAL VTV1GAA PNTSQS VS AQVYSNEGYHQH KDNAQLOLRN I LDG YOND 

35 LGlOfy'SSYYYYNIJlTWGLSSEQDIEKHYEELKNKLHDMYNHY 

SEQ ID NO: 2 

ATGACACTAAAAAAACACTATTATCTTCTCAGCCTGCTAGCTCTTGTAACGGTTGGTGCTGCCTTTAACA 
CAAGCCAGAGTGTCACnXSCACAAGTTT

AGACCTGCAATATAGTAAAGACAACGCACAACTTCAATTGAGAAAT^ 
CTAGGGAGACACTACTCTAGCTATTATTACTACAACCTAAGAACCGTTATGGGACT 
ACATTGAAAAACACTATGAAGAGCTTAAGAACAAGTTACAT^ : 

Preferred GAS 117 proteins for use with the invention comprise an amino acid sequence: (a) having
50% or more identity (e.g. 60%, 65%, 70%, 75%, 80%, 85%, 90%, 91%, 92%, 93%, 94%, 95%, 96%,
97%, 98%, 99%, 99.5% or more) to SEQ ID NO: 1; and/or (b) which is a fragment of at least n
consecutive amino acids of SEQ ID NO: 1, wherein n is 7 or more (e.g. 8, 10, 12, 14, 16, 18, 20, 25,
30, 35, 40, 50, 60, 70, 80, 90, 100 or more). These GAS 117 proteins include variants (e.g. allelic
variants, homologs, orthologs, paralogs, mutants, etc.) of SEQ ID NO: 1. Preferred fragments of (b)
comprise an epitope from SEQ ID NO: 1. Other preferred fragments lack one or more amino acids
(e.g. 1, 2, 3, 4, 5, 6, 7, 8, 9, 10, 15, 20, 25 or more) from the C-terminus and/or one or more amino
acids (e.g. 1, 2, 3, 4, 5, 6, 7, 8, 9, 10, 15, 20, 25 or more) from the N-terminus of SEQ ID NO: 1. For
example, in one embodiment, the underlined amino acid sequence at the N-terminus of SEQ ID NO: 1
is removed. Other fragments omit one or more domains of the protein (e.g. omission of a signal
peptide, of a cytoplasmic domain, of a transmembrane domain, or of an extracellular domain).
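
The identity threshold of (a), the fragment criterion of (b) and the terminal truncations described above can be expressed as simple checks. The Python sketch below is a minimal illustration under stated assumptions: the sequences are taken to be pre-aligned (the text does not prescribe an alignment method here), identity is computed over the full alignment length, and the function names and the toy reference sequence are hypothetical and are not taken from any SEQ ID NO.

```python
def percent_identity(aligned_a: str, aligned_b: str) -> float:
    """Percent identity of two pre-aligned sequences ('-' marks a gap).

    Assumption: identity = identical non-gap columns / alignment length.
    """
    if len(aligned_a) != len(aligned_b) or not aligned_a:
        raise ValueError("sequences must be aligned to the same non-zero length")
    matches = sum(1 for x, y in zip(aligned_a, aligned_b) if x == y and x != "-")
    return 100.0 * matches / len(aligned_a)


def is_fragment(candidate: str, reference: str, n: int = 7) -> bool:
    """True if `candidate` is a run of at least n consecutive reference residues."""
    return len(candidate) >= n and candidate in reference


def truncate(reference: str, n_term: int = 0, c_term: int = 0) -> str:
    """Remove `n_term` residues from the N-terminus and `c_term` from the C-terminus."""
    if n_term < 0 or c_term < 0 or n_term + c_term > len(reference):
        raise ValueError("invalid truncation lengths")
    return reference[n_term:len(reference) - c_term]


if __name__ == "__main__":
    toy_reference = "MKTAYIAKQRQISFVK"  # hypothetical sequence for illustration
    print(percent_identity("MKTAYIAK", "MKTAYLAK"))
    print(is_fragment("TAYIAKQ", toy_reference, n=7))
    print(truncate(toy_reference, n_term=2, c_term=3))
```

A candidate would satisfy (a) when `percent_identity` returns 50 or more against the reference, and (b) when `is_fragment` returns true for the chosen n.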

(2) GAS 130 

GAS 130 corresponds to M1 GenBank accession numbers GI:13621794 and GI:15674677, to M3
GenBank accession number GI:21909954, to M18 GenBank accession number GI:19745704, and is
also referred to as 'Spy0591' (M1), 4 SpyM3 JW18' (M3), and 'SpyM18_0660' (M18). GAS 130 has
been identified as a putative protease. Examples of amino acid and polynucleotide
sequences of GAS 130 of an M1 strain are set forth below:

SEQ ID NO: 3 

25 MS HMKKR PEVLS PAGTLE KL KVAI D YGADAVFVGGQAYGLRS RAGN FgMEB LQEGI D Y AHARGAKVYVAA 

NMVTHEGNE I GAGEWFRQLRDMGLDAVI VSDPALI VICSTEAPGLE I HLSTQASSTNYETFEFWKAMGLT 

RVVLAREVNMAELAEIRKRTDVEI EAPVHGAMCI SYSGRCVLSNHMSHRDANRGGCSQSCRWKYDLYDMP 

FGGERRSLKGEI PEDYSMSSVDMCMIDHI PDLI ENGVDS LKI EGRMKSIHYVSTVTNCYKAAVGAYMES P 

EAFYAIKEELIDELWKVAQRELATGFYYGI PTENEQLFGARRKI PQYKFVGEWAFDSASMTATIRQRNV 

30 IMEGDRIECYGPGFRHFETNAnCDLHDADGQKIDRAPNPMELLTISLPREVKPGDMIRACKEGLVNLYQ 
GTSKTVRT 

SEQ ID NO: 4 

ATGTCACATATGAAAAAACGTCCCGAGGTCTTATCACCT 
35 TTGACTATGGCGCAGATGCTGTTTTTGTTGGAGGGCAGG 

CTCTATGGAAGAATTGCAAGAAGGCATTGATTATGCAC^TGC 
AACATGGTTACCC^CGAAGGGAACGAAATTGGTGCGGGCGAGT 
TTGATGCGGTCATTGTTTCAGATCCAGCCITGATTGTTATTT 
TCATTTGTC^CGCAAGCITCATCTACCAATTATOAGAC 

40 CXaAGTTGTTTTAGCTCGCGAGGTTAATATGGCCGAGTTAGCAGAAATCCG 
TTGAAGCCTTTGTCCATGGAGCCATGTGTATCT 
TOVCCGTGATGCCAACAGGGGCGGCTGCTCACAGTCTTG 

TTTGGAGGAGAGCGCCGCTCpTTAAAAGGGGAAATTCXIAGAAGACT 
GTATGATTGACCATATTCCTGACCI^ 

45 ATCTATCCACTACGTCTCAACCGTAACCAACrcTTACAAGGCGGCrc 
GAAGCTTTTTATGCTATCAAAGAGGAATTGATTGACGAGTTC 
CAGGTTTTTACTATGGTATCCCAACTGAAAATGAACAATTATTTGGTC 
TAAATTTGTCGGAGAAGTAGTTGCCTTTGACTCAGCTAGCATG^ 

ATC^TGGAAGGCGATCGGATTGTyVTGTTATGGACGAGGTTTCCGTC^ . 
TACATGATGOK^TGGCCAAAAGATTGACCGTGCC^

GAGAGAAGTTAAGCCAGGGGATATGATTAGGGCTTCCAAGGAAGGTCTGG 
GGCACCAGTAAAACTGTTAGAACATAG 

Preferred GAS 130 proteins for use with the invention comprise an amino acid sequence: (a) having
50% or more identity (e.g. 60%, 65%, 70%, 75%, 80%, 85%, 90%, 91%, 92%, 93%, 94%, 95%, 96%,
97%, 98%, 99%, 99.5% or more) to SEQ ID NO: 3; and/or (b) which is a fragment of at least n
consecutive amino acids of SEQ ID NO: 3, wherein n is 7 or more (e.g. 8, 10, 12, 14, 16, 18, 20, 25,
30, 35, 40, 50, 60, 70, 80, 90, 100, 150, or more). These GAS 130 proteins include variants (e.g.
allelic variants, homologs, orthologs, paralogs, mutants, etc.) of SEQ ID NO: 3. Preferred fragments
of (b) comprise an epitope from SEQ ID NO: 3. Other preferred fragments lack one or more amino
acids (e.g. 1, 2, 3, 4, 5, 6, 7, 8, 9, 10, 15, 20, 25 or more) from the C-terminus and/or one or more
amino acids (e.g. 1, 2, 3, 4, 5, 6, 7, 8, 9, 10, 15, 20, 25 or more) from the N-terminus of SEQ ID
NO: 3. Other fragments omit one or more domains of the protein (e.g. omission of a signal peptide, of
a cytoplasmic domain, of a transmembrane domain, or of an extracellular domain).

(3) GAS 277 

GAS 277 corresponds to M1 GenBank accession numbers GI:13622962 and GI:15675741, to M3
GenBank accession number GI:21911206, to M18 GenBank accession number GI:19746852, and is
also referred to as 'Spy1939' (M1), ( SpyM3 J670* (M3), and 'SpyMlSJOOe* (M18). Amino acid
and polynucleotide sequences of GAS 277 of an M1 strain are set forth below:

SEQ ID NO: 5 

M TTMQKTI SLLSIALLIGLLGTSGKAISVYA 0D0HTDNVIAESTISQV5VRA nATVTTn^p 
WQPTQATITLKDASPOTINSmnfT^ 

QNKARKTPT^QQKDTSKAMTNSVTDVT>TKAQTNQSANQEIDSTSNPFRS 
25 ASNSQKNGSNKTKMLVDKEBVKPTSKRGPPWVLLGLVVSLAAGLFIAIQKVSRRK 

SEQ ID NO: 6 

ATGACAACTATGCAAAAAACAATTAGCTTATT 
GCAAAGCCATATCTCTGTATGCACAAGATCAGCACACTGA 
30 GGTCAGTGTTGAAGTCCAGTATGCGTGGAACAGAACCTT 
• GTCAGACAACCAACTCAGGCAACGATAACACTTAA^ 
ATACTATGGCAGCGCAACAGCGTCGTTTTACAGCTO 

TCATGTAACTGTCACCGTTCATACTCAAGAAAAGGCAGTAACTGGTCAATCAG 
CAAAACAAAGCTAGAAAAACACCAACTAATATGCAACAAAAGGATACnT 
35 TCGATGTAGACACAAAAGCTCAAACAAATCAATCAGCTAACCAAGAAA 
CAGATCAGCTACTAATCATCGATCAACTTCCTTAAAGCGA 

GCTAGTAATAGCCAAAAAAACGGTAGCAACAAGACAAAAATGCTAGTGGA 

CTTGAAAAAGAGGATTC^CTTGGGTCTTATTA 

TATTCAAAAAGTATCTAGACGAAAATAA 

Preferred GAS 277 proteins for use with the invention comprise an amino acid sequence: (a) having
50% or more identity (e.g. 60%, 65%, 70%, 75%, 80%, 85%, 90%, 91%, 92%, 93%, 94%, 95%, 96%,
97%, 98%, 99%, 99.5% or more) to SEQ ID NO: 5; and/or (b) which is a fragment of at least n
consecutive amino acids of SEQ ID NO: 5, wherein n is 7 or more (e.g. 8, 10, 12, 14, 16, 18, 20, 25,
30, 35, 40, 50, 60, 70, 80, 90, 100, or more). These GAS 277 proteins include variants (e.g. allelic
variants, homologs, orthologs, paralogs, mutants, etc.) of SEQ ID NO: 5. Preferred fragments of (b)
comprise an epitope from SEQ ID NO: 5. Other preferred fragments lack one or more amino acids
(e.g. 1, 2, 3, 4, 5, 6, 7, 8, 9, 10, 15, 20, 25 or more) from the C-terminus and/or one or more amino
acids (e.g. 1, 2, 3, 4, 5, 6, 7, 8, 9, 10, 15, 20, 25 or more) from the N-terminus of SEQ ID NO: 5. For
example, in one embodiment, the underlined amino acid sequence at the N-terminus of SEQ ID NO: 5
is removed. Other fragments omit one or more domains of the protein (e.g. omission of a signal
peptide, of a cytoplasmic domain, of a transmembrane domain, or of an extracellular domain).

(4) GAS 236 

GAS 236 corresponds to M1 GenBank accession numbers GI:13622264 and GI:15675106, to M3
GenBank accession number GI:21910321, to M18 GenBank accession number GI:19746075,
and is also referred to as ^1126* (M1), 'SpyM3_0785' (M3), and 'SpyM18_1087' (M18). Amino
acid and polynucleotide sequences of GAS 236 from an M1 strain are set forth below:

SEQ ID NO: 7

MTQMNYTGKyKRVAIIj^GKYQSKRVASKL 
DKVRFVGIHTGHLGFYTDYRDFE^KLI 

KTMVADVI INHVKFESFRGDGI SVSTPTGSTAYNKSLGGAVLHPTI BALQLTE I SSLNNRVFRTLGSS 1 1 
1 5 I PKKDKI BLVPKRLGI YTI S IDNKTYQLKNVTKVEY FIDDEKIHFVSSPSHTS FWERVKDAFIGE IDS 

SEQ ID NO: 8 

ATGACACAGATG AATTATA(^GGTAAGGTAAAACGAGTTGCTATTATTGCAAA 
AAGGCGTCGCCTCCAAACTTTTCTCCGTATTTAAAGATGATCCTGATTTCTATCTTTCAAAGA 
20 GGATATTGTGATTTCTATT^ 

GATAAGGTACGTTTTGTAGGAATCCACA^ 

TTGATAAATTAATTGATAATTTAAGAAAAGACAAGGGAGAACAAATCTCTTATCCGACT 
TATTACTTTAGATGATGGTCGTGTGGTTAAAGCGCGTGCTTTGAATGAA 
AAAACGATGGTAGCAGATGTTATTATTAACCATGTCAAATTTGAAAGCTTCCGAGG 
25 TATCGACCCCGACAGGGAGCACAGCCTACAATAAATCTTTAGGTGGTGCTGTCTTGCATC 
AGCGCTGCAATTGACGGAAATTTCCAGTCTTAATAACCGTGTCTT^ 
ATTCCCAAAAAAGATAAGATTGAGTTAGTGCCAAAACGATTAGGAATTTATACCATTT 
AAACCTATCAGTTAAAAAATGTGACGAACSGTGGAGTATTTTATCGACGATG^ 

CTCTCCGAGTCATACGAGCTTTTGGGAAAGGGTCAAGGATGCCTTTATTGGAGAGA . 

Preferred GAS 236 proteins for use with the invention comprise an amino acid sequence: (a) having
50% or more identity (e.g. 60%, 65%, 70%, 75%, 80%, 85%, 90%, 91%, 92%, 93%, 94%, 95%, 96%,
97%, 98%, 99%, 99.5% or more) to SEQ ID NO: 7; and/or (b) which is a fragment of at least n
consecutive amino acids of SEQ ID NO: 7, wherein n is 7 or more (e.g. 8, 10, 12, 14, 16, 18, 20, 25,
30, 35, 40, 50, 60, 70, 80, 90, 100, 150 or more). These GAS 236 proteins include variants (e.g. allelic
variants, homologs, orthologs, paralogs, mutants, etc.) of SEQ ID NO: 7. Preferred fragments of (b)
comprise an epitope from SEQ ID NO: 7. Other preferred fragments lack one or more amino acids
(e.g. 1, 2, 3, 4, 5, 6, 7, 8, 9, 10, 15, 20, 25 or more) from the C-terminus and/or one or more amino
acids (e.g. 1, 2, 3, 4, 5, 6, 7, 8, 9, 10, 15, 20, 25 or more) from the N-terminus of SEQ ID NO: 7. For
example, in one embodiment, the underlined amino acid sequence at the N-terminus of SEQ ID NO: 7
is removed. Other fragments omit one or more domains of the protein (e.g. omission of a signal
peptide, of a cytoplasmic domain, of a transmembrane domain, or of an extracellular domain).

(5) GAS 040 

GAS 040 corresponds to M1 GenBank accession numbers GI:13621545 and GI:15674449, to M3
GenBank accession number GI:21909733, to M18 GenBank accession number GI:19745402, and is
also referred to as 'Spy0269' (M1), 'SpyM3_0197' (M3), 'SpyMlSJESe' (M18) and 'prgA'. GAS
040 has also been identified as a putative surface exclusion protein. Amino acid and polynucleotide
sequences of GAS 040 from an M1 strain are set forth below:

SEQ ID NO: 9

5 MDLEQTKPNQWCQKIALTSTIALLSA SVGVSHQ 

EKTLSQQKAELTELATALTKTTAE I NHLKEQQDNEQKALTSAQB I YTNTLAS SEETLLAQGAEHQRBLTA 
TETELHNAQADQHSKETALSEQKAS I S AETTRAQDLVEQVKTSEQN I AKLNAMI SN PDA I TKAAQTANDN 
TKALSSELEKAKADLENQKAKVKKQLTEELAAQKAALABKEAEL^ 

PLEELKKLEASGYIGS ASYNNYYKEHADQI I AKAS PGNQLNQ YQDI PADRNRFVDPDNLTPEVQNELAQF 
10 AAHMINSVRRQLGLPPVTVTAGSQBFARLLSTSYKKTHGOT^ 

GASGLIRNDDNMYENIGAFNDVHTVNGI KRGI YDSI KYMLFTDHLHGNTYGHAINFLRVDKHNPNAPVYIj 
GFSTSNVGSLNEHFWPESNIANHQRFNKTPIKAVGSTK^ . 
HQEADIMAAQAKVSQI^KLASTLKQSDSLNLQV^ 

S L KAALHQTEAIAEQAAAR VTALVAKKAHLQ YLRD F KLN PNRLQVI RER I DNTKQDLAKTTS SLLNAQEA 
1 5 LAAI^AKQSSLEATI ATTEHQLTLLKTLANEKEYRHLDEDI ATVPDLQVAPPLTGVKPLSYSKIDTTPLV 

QEMVKETKQLLEASARLAAENTSLVAEALVGQTSEMVASNAIVSKITSSITQPSSKTSYGSGSSTTSNLI 
SDVDESTQR ALKAGWMLAAVGLTGPRFRKE SK 

SEQ ID NO: 10

20 atggacttagaacaaacgaagccaaaccaagttaagcagaaaattgctt^ 

tcagtc^gtgtaggcgtatctcaccaagtcaaagcagatgataga 

taatactcacgacgatagtttaccaaaaccagaaacaattcaagaggcaaaggcaactatt^ 

gaaaaaactctcagtcaacaaaaagcagaactgacagagcttgctaccxsctc^ 

aaatcaaccacttaaaac^gcagcaagataatgaacaaaaagctttaacctctc 
25 taatactcttgcaagtagtgaggagacgctattagcccaaggagccgaacatcaaaga 

actgaaacagagcttc^taatgctcaagcagatcaacattcaaaagagactgcattc 

ctagcatttp^gcagaaactactcgagctcaagatttagtcgaacaagt 

tgctaagctcaatgctatgattagcaatcctgatgctatcactaaagcagctcaaacggctaatgataat 

acaaaagcattaagctcagaattggagaaggctaaagctgacttagaaaatcaaam 
30 agcaattgactgaagagttggcagctcagaaagctgctct 

taaatcctcagctccgtctactcaagatagcattgtgggtaataataccatgaaagcaccx3 
cctcttgaagaacttaaaaaattagaagctagtggttatattggatcagct 
aagagcatgcagatcaaattattgccaaagctagtccagg^ 
agcagatcgtaatcgctttgttgatcccgataatttgac^ 
35 gcagctcacatgattaatagtgtaagaagacaatta 

aagaatttgcaagattacttagtaccagctataagaaaactcatggt^ 
cggacagccaggggtatcagggcattatggtgttgggcctcatgat^ 

ggagcgtcagggctcattcgaaatgatgataacatgtacgagaatatcggtgct 
ctctgaatggtattaaacgtggtatttat^ 

40 aaatacatacggccatgctattaactttttacgtgtagataaacataaccctaatgcg^ 
ggattttcaaccagcaatgtaggatctttgaatgaacact^ 

accatcaacgctttaataagacccctataaaagccgttggaagtacaaaagattatgcccaaa 
cactgtatctgatactattgc^gcx^ 

catcaagaagctgatattatggcagcccaagctaaagtaag 
. 45 ttaagcagtcagacagcttaAatctccaagtgagacaattaa^ 
attactagcagctaaagcaaaacaagcacaactcgaagct^ 

TCGTTGAAAGCCGCACTGC^CCAGACAGAAGCCTTAGCAGAGCAAGCCGCAGC 
TGGCTAAAAAA(XTCATTTGCAATATCTAAGGGACTTTAAATT^ 

TGAGCXK^TTGATAATACTAAGCAAGATTTGGCTAAAACTAC 
50 TTAGCAGCCTTAC^GCTAAACAAAGC^CTCTAGAAGCTACTATTGCTA 

TGCTTAAAACCTTAGCTAACGAAAAGGAATATCGCCACITAGACGAAGATATAGCTACTGTC 

GCAAGTAGCTCCACCTCTTACGGGCX5TAAAACCGCTATCATATAGTAAGATAGATACTACTCCGCTTC 

CAAGAAATGGTTAAAGAAACGAAACAACTATTAGAAGCTTCAGCAAGATTAGCTC 

TTGTAGCAGAAGCGCTTGTTGGCCAAACCTCTGAAATGGTAGCAAGTAATGCCA 
55 ATCTTCGATTACTCAGCCCTCATCTAAGACATCTTATGGCTCAGGATCCT 

TCTGATGTTGATGAAAGTACTCAAAGAGCTCTTAAAGCAGGAGTC 

CAGGATTTAGGTTCCGTAAGGAATCTAAGTGA

Preferred GAS 040 proteins for use with the invention comprise an amino acid sequence: (a) having
50% or more identity (e.g. 60%, 65%, 70%, 75%, 80%, 85%, 90%, 91%, 92%, 93%, 94%, 95%, 96%,
97%, 98%, 99%, 99.5% or more) to SEQ ID NO: 9; and/or (b) which is a fragment of at least n
consecutive amino acids of SEQ ID NO: 9, wherein n is 7 or more (e.g. 8, 10, 12, 14, 16, 18, 20, 25,
30, 35, 40, 50, 60, 70, 80, 90, 100, 150, 200, 250 or more). These GAS 040 proteins include variants
(e.g. allelic variants, homologs, orthologs, paralogs, mutants, etc.) of SEQ ID NO: 9. Preferred
fragments of (b) comprise an epitope from SEQ ID NO: 9. Other preferred fragments lack one or
more amino acids (e.g. 1, 2, 3, 4, 5, 6, 7, 8, 9, 10, 15, 20, 25 or more) from the C-terminus and/or one
or more amino acids (e.g. 1, 2, 3, 4, 5, 6, 7, 8, 9, 10, 15, 20, 25 or more) from the N-terminus of SEQ
ID NO: 9. For example, in one embodiment, the underlined amino acid sequence at the N-terminus of
SEQ ID NO: 9 is removed. As another example, in one embodiment, the underlined amino acid
sequence at the C-terminus of SEQ ID NO: 9 is removed. Other fragments omit one or more domains
of the protein (e.g. omission of a signal peptide, of a cytoplasmic domain, of a transmembrane
domain, or of an extracellular domain).

Further illustration of domains within GAS 40 is shown in FIGURES 1 and 2. As shown in these
figures, GAS 40 contains a leader peptide sequence within amino acids 1 - 26, a coiled-coil region
within amino acids 58 - 261, a coiled-coil region within amino acids 556 - 733, a leucine zipper
region within amino acids 673 - 701 and a transmembrane region within amino acids 855 - 866.
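
The residue ranges recited above can be held in a small lookup table, which makes it straightforward to see which regions overlap and to prepare fragments that omit a region. The Python sketch below is illustrative only: the names `GAS40_REGIONS`, `regions_at` and `excise` are assumptions introduced here, the ranges are treated as the outer bounds within which each region lies, and positions are 1-based and inclusive.

```python
# Region bounds for GAS 40 as recited in the text (1-based, inclusive).
GAS40_REGIONS = [
    ("leader peptide", 1, 26),
    ("coiled-coil 1", 58, 261),
    ("coiled-coil 2", 556, 733),
    ("leucine zipper", 673, 701),
    ("transmembrane", 855, 866),
]


def regions_at(position: int) -> list:
    """Names of all listed regions whose bounds contain `position`."""
    return [name for name, start, end in GAS40_REGIONS
            if start <= position <= end]


def excise(sequence: str, start: int, end: int) -> str:
    """Return `sequence` with residues start..end (1-based, inclusive) removed."""
    if not 1 <= start <= end <= len(sequence):
        raise ValueError("range must lie within the sequence")
    # Python slices are 0-based and end-exclusive, hence start - 1 and end.
    return sequence[:start - 1] + sequence[end:]


if __name__ == "__main__":
    print(regions_at(680))          # falls in both coiled-coil 2 and the leucine zipper
    print(excise("ABCDEFGH", 1, 3)) # toy sequence, first three residues removed
```

On these bounds the leucine zipper range lies inside the second coiled-coil range, so a fragment spanning residues 673 to 701 would include a portion of both.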

The coiled-coil regions of GAS 40 are likely to be involved in the formation of oligomers such as
dimers or trimers. Such oligomers could be homomers (containing two or more GAS 40 proteins
oligomerized together) or heteromers (containing one or more additional GAS proteins oligomerized
with GAS 40).

Accordingly, in one embodiment, the combinations of the invention include a GAS 40 antigen in the
form of an oligomer. The oligomer may comprise two or more GAS 40 antigens or fragments thereof, or
it may comprise GAS 40 or a fragment thereof oligomerized to a second GAS antigen. Preferably, a
GAS 40 fragment used within an oligomer includes a portion of one of the coiled-coil or leucine
zipper domains.

(6) GAS 389 

GAS 389 corresponds to M1 GenBank accession numbers GI:13622996 and GI:15675772, to M3
GenBank accession number GI:21911237, to M18 GenBank accession number GI:19746884, and is
also referred to as 'Spy1981' (M1), € SpyM3_170r (M3), 'SpyM18_2045' (M18) and 'relA'. GAS
389 has also been identified as a (p)ppGpp synthetase. Amino acid and polynucleotide sequences of
GAS 389 from an M1 strain are set forth below:

SEQ ID NO: 11 

35 MRNEMAKIMNVTGEEVI AIAATYOTK PY I VH PI QVAG I LADLHL 

DAVTVArcFLHDVVEiyrDI TLDEI KADFGHD 

VILVKLADRLHNMRTLKHLRKDKQ SRIKWELEDLAFRYLNETEFYKI SHM 

MKE KRR ER EAL VEAI VS KVKT YTTQQGL FGDVYGR PKH I YS I YRKMRDKKKR FDQ I FDL I A I RCVME TQ S ' 
DVYAMVGY I HELWR PM PGR FKDY I AAPKANGYQS I HTTVYGPKGPI E I Q I RTKDMHQVAE YGVAAHWAYK 
KGVRGKVNQAEQAVGMNWI KELVELQDASNGDAVDFVDS VKBDI FSBRI YVFTPTGAVQBLPKESGPI DF
A YAI HTQ IGEKATGAKVNGRMVPLTAKLKTGD WBI I TN ANS FG PSRDWVKLVKTNKARN KI RQFFKNQD 
KELSVNKGRDLLVSYFXJEQGYVANKYLDKKRIEAI^ 

REEERAKAKAEAEELVKGGEVKHENKDVLKWSENGVIIQGASGLLMRIAKCC^ 
5 I AIHRSDCHNI KSQDGYQERLI EVBWDLDNSS KDYQAEI DI YGLNRSGLLNDVLQI LSNSTKS I STVNAQ ■ 
PTKDMKF AN IHVSFGI PNLTHLTTWEKI KAV PDVY S VKRTNG 

SEQ ID NO: 12 

ATGAGGAACGAAATGGCAAAAATAATGAACGTAACAGGAGAAGAAGTC 
1 0 TGACCAAGGCTGATGTGGCTTTTGTGGCAAAGGCTTTAGCATATGCAACAGCGGCCCATTTC 

GAGAAAGTCAGGCGAACCCTATATCGTCCATCCGATTCAGGTGGCGGGGATTCTGGCTGATTTC 
GATGCTGTGACAGTTGCTTGTGGCTTTTTACATGATC 

TCGAAGCAGACTTTGGCCATGATGCTCGTGATATCGTTGATGGTGTCACCAAGTTAGGTGAAG 

CAAATCTCATGAGGAGCAACTCGCCGAAAACCATCGCAAAATGCTGATGGCTATGTCCAM 
1 5 GTGATTTTGGTGAAATTGGCTGACCGCCTGCATAATATC 

AAGAGCGCATTTCX5CGCGAAACCATGGAAATCTATGCCCCCTTGGCGCATCGTTTGG 

CAAATCG<^CTAGAAGATTTGGCTTTTCGTTACCTCAA 

ATGAAAGAAAAACGTCGCGAGCGTGAAGCTTTGGTAGAGGCTATTC 

CACAACAAGGGTTGTTTGGAGATGTGTATGGCCGACCAAAACACATTTATTCGACT 
20 GGACAAAAAGAAACGATTCGATCAGATTTTTGATCTGATTGCCATTCG 

GATGTCTATGCTATGGTTGGCTATATTCATGAGC 

TTGCAGCTCCTTAAGCTAATGGCTACCAGTCTATTCATACCACCGTGTATGGGCCAAAAGGACCTATTGA 
GATTCAAATCAGAACTAAGGACATGCATCAAGTGGCTGAGTACGGGGTTGCTG^ 
AAAGGCGTGCGTGGTAAGGTCAATCAAGCTGAGC^GCCGTTGGCATGAACTGGAT 
25 AATTGCAAGATGCCTCAAATGGCGATGCAGTGGACTTTGTGGATTCG^ 

ACGGATTTATGTCTTTACACCGACAGGGGCCGTTCAGGAGTTACCAAAAGAATCAGGTCCTATTGATTTT 
GCTTATGCGATCCATACGCAAATCGGTGAAAAAGCAACAGGTGCCAAAGTCAATGGATCTATGGTTCCTC 
TCACTGCCAAGTTAAAAAGAGGAGATGTGGTTGAAATCATC^ 

AGACTGGGTAAAACTGGTCAAAACCAATAAGGCTCGCAACAAAATTCGTCAGTTCTTO 
3 0 AAGGAATTGTCAGTGAATAAAGGCCGTGATTTGTTGGTGTCTTATTTTCAAGAGCAGGGCT^ 

ATAAATACCTTGACAAAAAACGCATTGAAGCCATCCTTCCAAAAGTCAGTGTGAAG 

CTATGCAGCCGTTGGGTTTGGTGACATTAGTCCTATCAGTGTC^ 

CGTGAAGAAGAAAGGGCCAAGGCTAAAGCAGAAGCTGAAGAATTGGTTAAGGGCGG^ 

AAAACAAAGATGTGCTCAAGGTTCGCAGTGAAAATGGAGTCATTATC 
3 5 GCGGATTGCCAAGTGTTGTAATCCTGTACCTGGTGATCCTATTGACGGCTACATTACCAAAQ 

ATTGCGATTCACAGATCGGACTGTCATAACATTAAGAGTCAAGATGGCTACCAAGAACX3CT 

TCGACTGGGATTTGGACAATTCGAGTAAAGATTATCAGGCTGAAATTGATATCT^ 

TGGTCTGCTTAATGATGTWTCCAAATTTTATCAAACTC^ 

CCGACCAAGGACATGAAGTTTGCTAATATTCACGTGAGCTTTGGCATTCCAAATCTGA 
40 CTGTTGTCGAAAAAATCAAGGCAGTTCCAGATGTTTATAGCGTGAAGCGGACCAATGGCTAA 

Preferred GAS 389 proteins for use with the invention comprise an amino acid sequence: (a) having
50% or more identity (e.g. 60%, 65%, 70%, 75%, 80%, 85%, 90%, 91%, 92%, 93%, 94%, 95%, 96%,
97%, 98%, 99%, 99.5% or more) to SEQ ID NO: 11; and/or (b) which is a fragment of at least n
consecutive amino acids of SEQ ID NO: 11, wherein n is 7 or more (e.g. 8, 10, 12, 14, 16, 18, 20, 25,
30, 35, 40, 50, 60, 70, 80, 90, 100, 150, 200, 250 or more). These GAS 389 proteins include variants
(e.g. allelic variants, homologs, orthologs, paralogs, mutants, etc.) of SEQ ID NO: 11. Preferred
fragments of (b) comprise an epitope from SEQ ID NO: 11. Other preferred fragments lack one or
more amino acids (e.g. 1, 2, 3, 4, 5, 6, 7, 8, 9, 10, 15, 20, 25 or more) from the C-terminus and/or one
or more amino acids (e.g. 1, 2, 3, 4, 5, 6, 7, 8, 9, 10, 15, 20, 25 or more) from the N-terminus of SEQ
ID NO: 11. Other fragments omit one or more domains of the protein (e.g. omission of a signal
peptide, of a cytoplasmic domain, of a transmembrane domain, or of an extracellular domain).




(7) GAS 504 

GAS 504 corresponds to M1 GenBank accession numbers GI:13622806 and GI:15675600, to M3
GenBank accession number GI:21911061, to M18 GenBank accession number GI:19746708, and is
also referred to as 'Spy1751' (M1), 'SpyM3J525\ 'SpyM18J823' (M18) and 'fabK'. GAS 504
has also been identified as a putative trans-2-enoyl-ACP reductase II. Amino acid and polynucleotide
sequences of GAS 504 of an M1 strain are set forth below:

SEQ ID NO: 13 

MKTRITELLNIDYPI FQGGMAWVADGDLAGAVSNAGGLG 1 1 GGGNAPKE WKAN I DRVKAI TDR PFGVN I 
MLLSPFADDI VDLVIEEGVKVVtTGAGNPGKYMERLHQAGI I WPWPSVALAKRMEKLGVDAVI AEGME 
• 10 AGGH I GKLTTMS L VRQWEAVS I P V I AAGG I ADGHGAAAA FMLGAEA VQ I GTR FWAKE SN AHQN FKDKI 
LAAKDI DTVI SAQWGH PVRS I KNKLTSAYAKAEKAFLI GQKTATDI EEMGAGS LRHAV I EGDWNGS VM 
AGOI AGLVRKEESCBTILKDI YYGAARVI QNEAKRWQSVS I EK 

SEQ ID NO: 14 

1 5 ATGAAAACACGTATTACAGAATTACTTAATATTGATT^ 

CTGATGGTGATTTAGCAGGTGCAGTTTCTAATGCTGGTGGTTTAGGCA 

CAAAGAAGTCGTTAAAGCTAATATTGATCGTGTCAAAGCTAT^ 

ATGCTTTTATCTCCTTTTGCTGATGATATCG 

CAGGCGCAGGAAATCCAGGAAAGTATATGGAAAGACTGCACCAGGCGGGTATAATCGTTGTTCCTC 
20 CCCAAGCGTTGCGCTAGCCAAACGTATGGAAAAGCTTGGGGTAGATGCTC 
GCTGGAGGACATATTGGCAAGTTAACGACTATGTCTTT^^ 

CTGTCATTGCGGCAGGTGGTATAGCTGATGGTCATGGTGCAGCAGCAGC^T 
TGTTCAAATTGGAACTCGCTTTGTTGTTGCTAAAGAATCCAATGCT 
. TTAGCAGCAAAAGATATTGATACGGTGATTTCTGCGCA 
25 ATAAATTGACCTCAGCTTACGCTAAAGCAGAAAAAGCATTTTTAATT 

TGAAGAAATGGGAGCAGGATCGCTTCGACACGCTGTTATTGAAGGCGATGTAGTCAATGGATCTG 
GCTGGCOVAATTGCAGGGCTTGTGAGAAAAGAAG 

GTGCAGCTCGTGTTATTCAAAATGAAGCTAAGCGCTGGCAATCTGTTTCAATA 

Preferred GAS 504 proteins for use with the invention comprise an amino acid sequence: (a) having
50% or more identity (e.g. 60%, 65%, 70%, 75%, 80%, 85%, 90%, 91%, 92%, 93%, 94%, 95%, 96%,
97%, 98%, 99%, 99.5% or more) to SEQ ID NO: 13; and/or (b) which is a fragment of at least n
consecutive amino acids of SEQ ID NO: 13, wherein n is 7 or more (e.g. 8, 10, 12, 14, 16, 18, 20, 25,
30, 35, 40, 50, 60, 70, 80, 90, 100, 150 or more). These GAS 504 proteins include variants (e.g. allelic
variants, homologs, orthologs, paralogs, mutants, etc.) of SEQ ID NO: 13. Preferred fragments of (b)
comprise an epitope from SEQ ID NO: 13. Other preferred fragments lack one or more amino acids
(e.g. 1, 2, 3, 4, 5, 6, 7, 8, 9, 10, 15, 20, 25 or more) from the C-terminus and/or one or more amino
acids (e.g. 1, 2, 3, 4, 5, 6, 7, 8, 9, 10, 15, 20, 25 or more) from the N-terminus of SEQ ID NO: 13.
Other fragments omit one or more domains of the protein (e.g. omission of a signal peptide, of a
cytoplasmic domain, of a transmembrane domain, or of an extracellular domain).

(8) GAS 509 

GAS 509 corresponds to M1 GenBank accession numbers GI:13622692 and GI:15675496, to M3
GenBank accession number GI:21910899, to M18 GenBank accession number GI:19746544, and is
also referred to as 'Spy1618' (M1), 'SpyM3_1363' (M3), 'SpyM18_1627' (M18) and 'cysM'. GAS
509 has also been identified as a putative O-acetylserine lyase. Amino acid and polynucleotide
sequences of GAS 509 of an M1 strain are set forth below:

SEQ ID NO: 15

MTKIYKTITELVGQTPIIKLNRLIPNEAADVYVKLEAFNPGSSVKDRI 

PTSGNTGI GLAWVGAAKGYRVI IVM PETMS LERRQI IQAYGAELVLT PGAEGMKGAI AKAETLAI ELGAW 
MPMQFNNPANPSIHEKTTAQEILEAFKEISLDAFVSGVGTGG 
5 LSGQE PG PHKI QG I SAGF I PNTLDTKAYDQI I RV KSKDALBTARLTGAKBG FLVGI SSGAALYAAI BVAK 
QLGKGKHVLTILPDNGBRYLSTELYDVPVIKTK 

SEQ ID NO: 16 

ATGACTAAAATTTACAAAACTATAACAGAATTAGTAGGTCAMCACCT 
] 0 TTCCAAACGAAGCTGCTGACGTTTATGTAAAATTAGAAGCTTTTAACCCAGGATCTTC 
TATTGCTTTATCGATGATTGAAGCTGCTGAAGCTGAA 

CCAACAAGTGGTAATACAGGTATTGGTCTTGCATGGGTAGGTGCTGCTAAAGGGTA 
TTATGCCCGAAACTATGAGCTTGGAAAGACGGCAAATCATTCAGGCTTATGGTGCAGA 

ACCTGGAGCAGAAGGTATGAAAGGGGCTATTGCAAAAGCTGAAACTTTAGCAATAGAACTAGGTC 
1 5 ATGCCTATGCAATTTAATAACCCTGCCAATCC^ . 

AAGCTTTTAAGGAGATTTCTTTAGATGCATTCGTATCTGGTGTTGGTACTG 

TTCACATGTCTTGAAAAAAGCTAACCCTGAAACTGTTATCTATGCTGTTGAAGCTGAAGAATCTC 

TTATCTGGTCAAGAGCCTGGACCACATAAAATTCAAGGTATAT 

ATACCAAAGCCTATGACCAAATTATCCGTGTTAAATCGAAAGATGCTTTAGAAACTGCTCGACTA^ 
20 AGCTAAGGAAGGC TTCCTGGTTGGGATTTCITTCTGGAGCTGCTCTTTACG 

CAGTTAGGAAAAGGCAAACATGTGTTAACTATTTTACCAGATAATGGCGAACGCT 
TCTATGATGTACCAGTAATTAAGACGAAATAA 


Preferred GAS 509 proteins for use with the invention comprise an amino acid sequence: (a) having
50% or more identity (e.g. 60%, 65%, 70%, 75%, 80%, 85%, 90%, 91%, 92%, 93%, 94%, 95%, 96%,
97%, 98%, 99%, 99.5% or more) to SEQ ID NO: 15; and/or (b) which is a fragment of at least n
consecutive amino acids of SEQ ID NO: 15, wherein n is 7 or more (e.g. 8, 10, 12, 14, 16, 18, 20, 25,
30, 35, 40, 50, 60, 70, 80, 90, 100 or more). These GAS 509 proteins include variants (e.g. allelic
variants, homologs, orthologs, paralogs, mutants, etc.) of SEQ ID NO: 15. Preferred fragments of (b)
comprise an epitope from SEQ ID NO: 15. Other preferred fragments lack one or more amino acids
(e.g. 1, 2, 3, 4, 5, 6, 7, 8, 9, 10, 15, 20, 25 or more) from the C-terminus and/or one or more amino
acids (e.g. 1, 2, 3, 4, 5, 6, 7, 8, 9, 10, 15, 20, 25 or more) from the N-terminus of SEQ ID NO: 15. For
example, in one embodiment, the underlined amino acid sequence at the C-terminus of SEQ ID NO:
15 is removed. Other fragments omit one or more domains of the protein (e.g. omission of a signal
peptide, of a cytoplasmic domain, of a transmembrane domain, or of an extracellular domain).
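
The terminal deletions described above (fragments lacking 1, 2, 3, ... 25 or more residues from the C-terminus and/or the N-terminus) can be sketched as simple string slicing. This is a minimal illustration under stated assumptions: the sequence is a toy string, not SEQ ID NO: 15, and the helper names `truncate_termini` and `deletion_series` are hypothetical.

```python
# Deletion sizes listed in the text as examples.
EXAMPLE_COUNTS = (1, 2, 3, 4, 5, 6, 7, 8, 9, 10, 15, 20, 25)


def truncate_termini(seq, n_term=0, c_term=0):
    """Return seq lacking n_term residues at the N-terminus and c_term at the C-terminus."""
    if n_term < 0 or c_term < 0:
        raise ValueError("deletion counts must be non-negative")
    if n_term + c_term >= len(seq):
        raise ValueError("deletions would remove the whole sequence")
    return seq[n_term:len(seq) - c_term]


def deletion_series(seq, counts=EXAMPLE_COUNTS, end="C"):
    """List (k, fragment) pairs for each deletion size k that leaves residues behind."""
    series = []
    for k in counts:
        if k >= len(seq):
            continue  # skip sizes that would delete everything
        if end == "C":
            series.append((k, truncate_termini(seq, c_term=k)))
        else:
            series.append((k, truncate_termini(seq, n_term=k)))
    return series


toy = "MKTAYIAKQR"  # hypothetical 10-residue sequence
print(truncate_termini(toy, n_term=2, c_term=3))
for k, fragment in deletion_series(toy, counts=(1, 2, 15), end="C"):
    print(k, fragment)
```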

(9) GAS 366 

GAS 366 corresponds to M1 GenBank accession numbers GI:13622612, GI:15675424 and
GI:30315979, to M3 GenBank accession number GI: 21910712, to M18 GenBank accession number
GI: 19746474, and is also referred to as 'Spy1525' (M1), 'SpyM3_1176' (M3), 'SpyM18_1542'
(M18) and 'murD'. GAS 366 has also been identified as a UDP-N-acetylmuramoylalanine-D-glutamate
ligase or a D-glutamic acid adding enzyme. Amino acid and polynucleotide sequences of
GAS 366 of an M1 strain are set forth below:

SEQ ID NO: 17 

MKVI SNFQNKKI LI LGIAKSGKAAAK LLTKIX3ALVTVNDSK^ 
45 ENFEYMVKNPGIPYDNPMVKRALAKEIPI^ 

ALLS GN I GY PASKWQKAI AGDTLVMBLS SFQLVGVNAPR PHI AV I TNLMPTHLDYHGSFEDYVAAKWMI 
QAQMTESDYLI LNANQBI S ATLAKTT KATVI PFSTQKWDGAYLKDGILYFKEQAJ IAATDLGVPGSHNI 
ENALATIAVAKLSGIADDIIAQCLSHFCk»VKHRLQR^ 

L I AGGLDRGNE FDDLVPDLLGLKQMIILGESAERM KRAAN KAE VS YLEARNVABAT E LAP KLAQTGDT I L . 
LS PANAS WDM Y PNFEVRGDE FLATFDCLRGDA

SEQ ID NO: 18 

ATGAAAGTGATAAGTAATTTTCAAAACAAAAAAATATTAATATT^ 
CAGCAAAATTATTGACCAAACTTGGTGCTTTAGTGACTGTTAAT^ 
5 AGCGGCACAAGCCTTGTTGGAAGAGGGGATTAAGGTCATTTGTGGTAGCCACC 
GAGAACTTTGAGTAWTGGTTAAAAACCCTGGGATTCCTTATGATAATCCT 
CAAAGGAAATTCCCATCTTGACTGAAGTAGAATTGGCHTAT 

TACAGGATCAAACGGGAAGACAACCACAACGACAATGATTGCCGATGTTTTGAATC 
GCACTCTTATCTGGAAAC^TTGGTTATCCTGCTTCAAAAGTTC 
1 0 TGGTGATGGAATTGTCCTCTTTTCAATTAGTGGGAGTGAATGCTT 

TAATTTAATGCCGACTCACCTGGACTATCATGGCAGTTTTGAGGATTATGTTGCTGCTA 

CAAGCTCAGATGACAGAATCAGACTACCTTATTTTAAATGCTAATCAAGAGAT^ 

AGACCACCAAAGCAACAGTGATTCCTTTTTCAACT 

AATACTCTATTTTAAAGAACAGGCGATTATAGCTGCAACTGACTTAGGTGTCCCAGGTAGCCACA^ 
1 5 GAAAATGCCCTAGCAACTATTGC^GTTGCCAAGTTATCTGGTATTGCTGATGATAT^ 

TTTCACATTTTGGAGGCGTTAAACATCGTTTGCAACGGGTTGGTCAAATCAAAGATATO 

* TGACAGTAAGTCAACCAATATTTTAGCCACTCAAAAAGCTTTATCAGG 

• TTGATTGCTGGCGGTCTAGATCGTGGCAATGAATTTGACX3ATTTGGTC 
AGATGATTATTTTGGGAGAATCCGCAGAGCGTATGAAGCGAGC 

20 TGAAGCTAGAAATGTGGCAGAAGCAACAGAGCTTGCTTTTA^ 

CTTAGCCCAGCCAATGCTAGCrcGGATATGTATCCTAATTTTGAGGTTCGTGGGGA 
CCTTTGATTGTTTAAGAGGAGATGCCTAA 

Preferred GAS 366 proteins for use with the invention comprise an amino acid sequence: (a) having
50% or more identity (e.g. 60%, 65%, 70%, 75%, 80%, 85%, 90%, 91%, 92%, 93%, 94%, 95%, 96%,
97%, 98%, 99%, 99.5% or more) to SEQ ID NO: 17; and/or (b) which is a fragment of at least n
consecutive amino acids of SEQ ID NO: 17, wherein n is 7 or more (e.g. 8, 10, 12, 14, 16, 18, 20, 25,
30, 35, 40, 50, 60, 70, 80, 90, 100, 150 or more). These GAS 366 proteins include variants (e.g. allelic
variants, homologs, orthologs, paralogs, mutants, etc.) of SEQ ID NO: 17. Preferred fragments of (b)
comprise an epitope from SEQ ID NO: 17. Other preferred fragments lack one or more amino acids
(e.g. 1, 2, 3, 4, 5, 6, 7, 8, 9, 10, 15, 20, 25 or more) from the C-terminus and/or one or more amino
acids (e.g. 1, 2, 3, 4, 5, 6, 7, 8, 9, 10, 15, 20, 25 or more) from the N-terminus of SEQ ID NO: 17. For
example, in one embodiment, the underlined amino acid sequence at the N-terminus of SEQ ID NO:
17 is removed. Other fragments omit one or more domains of the protein (e.g. omission of a signal
peptide, of a cytoplasmic domain, of a transmembrane domain, or of an extracellular domain).

(10) GAS 159 

GAS 159 corresponds to M1 GenBank accession numbers GI:13622244 and GI:15675088, to M3
GenBank accession number GI: 21910303, to M18 GenBank accession number GI: 19746056, and is
also referred to as 'Spy1105' (M1), 'SpyM3_0767' (M3), 'SpyM18_1067' (M18) and 'potD'. GAS
159 has also been identified as a putative spermidine/putrescine ABC transporter (a periplasmic
transport protein). Amino acid and polynucleotide sequences of GAS 159 of an M1 strain are set
forth below:

SEQ ID NO: 19 

M RKLYSFLAGVTjGVIVILTSLSFI LQKKSGSGSQSD . 
45 NEAMYTKIKQGGTTYDIAVPSDyriDKMIKEN^ 
TVGIVWDQLVT)KAPMHWEDLVrcPEY 
PNVKAI VADEMKGY*IIQGDAAIGI TFSGEASBMU)SN 

FLNF I NR PENAAQNAAYIGYATPNKKAKALLPDE I KNDPAFY PTDDI I KKLEVYDNLGSRWLG I YNDLYL 
QFKMYRK

SEQ ID NO: 20 

ATGCGTAAACTTTATTCCTTTCTAGCAGGAGTTTT^ 
5 TCTTGCAGAAAAAATCXXXSTTCTGGTACTCA^ 

TGATCCAGCTTTGCTCAAAAAATTCACCAAAGAAACGGGCATTGAAGTC 

AATGAAGCCATGTACACTAAAATCAAGCAGGGO^ 

CCATTGATAAAATGATCAAAGAAAACCTACTCAATAAGCTTG^ 

TATCGGGAAAGJU^TTTTTAGGGAAAAGCTTTGAC CCACAAAACGACTATTCTTTGCCTT ATTTCTGGGGA 
1 0 ACCGTTGGGATTGTTTATAATGATCAATTAGTTGA 

CAGAATATAAAAATAGTATTATGCTGAOTGATGGAGCGCGTGAAATGCTAGGGGTT 

TGGTTATAGTGTGAATTOTAAAAATCTAGAGC^GlTGCAGGttGC 

CCGAATGTTAAAGCCATTGTAGCAGATGAGATGAAAGGCTACATGATTC 

TTACC TTTTCTGGTGAAGC CAGTGAGATGTTAGATAGTAACGT^CACCTTCACTACATCGTGCCTTCAGA 
1 5 AGTCTCTAACCITrGGTTTGATAATTTGG 

TTTTTGAACTTTATCAATCGTCCTGAAAATCCTGCGCAAAATGCTGCATATATTGGTTA 
ATAAAAAAGCCAAGGCCTTACTTCCAGATGAGATAAAAAATGATCCTC 
TATCAAAAAATTGGMGTTTATGACAATTTAGGGTCAAGATGGTTGGGGATTT 
CAATTTAAAATGTATCGCAAATAA 


Preferred GAS 159 proteins for use with the invention comprise an amino acid sequence: (a) having
50% or more identity (e.g. 60%, 65%, 70%, 75%, 80%, 85%, 90%, 91%, 92%, 93%, 94%, 95%, 96%,
97%, 98%, 99%, 99.5% or more) to SEQ ID NO: 19; and/or (b) which is a fragment of at least n
consecutive amino acids of SEQ ID NO: 19, wherein n is 7 or more (e.g. 8, 10, 12, 14, 16, 18, 20, 25,
30, 35, 40, 50, 60, 70, 80, 90, 100, 150 or more). These GAS 159 proteins include variants (e.g. allelic
variants, homologs, orthologs, paralogs, mutants, etc.) of SEQ ID NO: 19. Preferred fragments of (b)
comprise an epitope from SEQ ID NO: 19. Other preferred fragments lack one or more amino acids
(e.g. 1, 2, 3, 4, 5, 6, 7, 8, 9, 10, 15, 20, 25 or more) from the C-terminus and/or one or more amino
acids (e.g. 1, 2, 3, 4, 5, 6, 7, 8, 9, 10, 15, 20, 25 or more) from the N-terminus of SEQ ID NO: 19. For
example, in one embodiment, the underlined amino acid sequence at the N-terminus of SEQ ID NO:
19 is removed. In another example, the underlined amino acid sequence at the C-terminus of SEQ ID
NO: 19 is removed. Other fragments omit one or more domains of the protein (e.g. omission of a
signal peptide, of a cytoplasmic domain, of a transmembrane domain, or of an extracellular domain).

(11) GAS 217 

GAS 217 corresponds to M1 GenBank accession numbers GI:13622089 and GI: 15674945, to M3
GenBank accession number GI: 21910174, to M18 GenBank accession number GI: 19745987, and is
also referred to as 'Spy0925' (M1), 'SpyM3_0638' (M3), and 'SpyM18_0982' (M18). GAS 217 has
also been identified as a putative oxidoreductase. Amino acid and polynucleotide sequences of GAS
217 of an M1 strain are set forth below:

SEQ ID NO: 21

MAQRIIVITGASGGLAQAIVKQLPKEDSLILLGRNKBRLEHCYQHIDNKECLELDITN 

QR YGR IDVLI NNAG YGAFKGF EEFS AQE I ADMFQVNTLAS I HFACLI GQKMAEQGQGHL I N I VS MAGL I A 

SAKSSIYSATKFALIGFSNALRLBIADKGVYVTTVNro 

RLVSI IGKNKRELNLPFSLAVTHQFYTLFPKLSDYLARKVFNYK 


SEQ ID NO: 22 


ATGGCACAAAGAATCATTGTTATCACGGGAGCTTCTGGAGGACT 
CCAAGGAAGACAGCTTGATTTTATTAGGACGTAACAAAGAACG 

CAACAAAGAATGCCTCGAGTTGGATATTACCAATCCAGT^^ 

CAGCGCTATGGCOSTArrGATGTCTTGATTAATAATGCTGGCTAC 
TTTCTGCCCAAGAAATAGCTGATATGTTTCAGGTT 

TGGTCAGAAAATGGttGAGCAGGGGCAAGGTCACCTTATTAATATTGTCT 
5 TCAGCCAAATCGAGCATTTATTCAGCCACCAAGTTTGCCCTC 

AATTAGCGGATAAAGGGGTTTACGTGACCACCGTGAATCCAGGTC 
AGCTGACCCGTCTGGAC^TTATTTGGAAAGCGTTGGTAAATTTACTCT 
CGTTTGGTTTCTATTATCGGGAAAAATAAACGAGAATTGAATTTGCCCTTTA 
AATTTTACACCCTTTTCCCTAAATTATCTGATTATCT^ 


Preferred GAS 217 proteins for use with the invention comprise an amino acid sequence: (a) having
50% or more identity (e.g. 60%, 65%, 70%, 75%, 80%, 85%, 90%, 91%, 92%, 93%, 94%, 95%, 96%,
97%, 98%, 99%, 99.5% or more) to SEQ ID NO: 21; and/or (b) which is a fragment of at least n
consecutive amino acids of SEQ ID NO: 21, wherein n is 7 or more (e.g. 8, 10, 12, 14, 16, 18, 20, 25,
30, 35, 40, 50, 60, 70, 80, 90, 100 or more). These GAS 217 proteins include variants (e.g. allelic
variants, homologs, orthologs, paralogs, mutants, etc.) of SEQ ID NO: 21. Preferred fragments of (b)
comprise an epitope from SEQ ID NO: 21. Other preferred fragments lack one or more amino acids
(e.g. 1, 2, 3, 4, 5, 6, 7, 8, 9, 10, 15, 20, 25 or more) from the C-terminus and/or one or more amino
acids (e.g. 1, 2, 3, 4, 5, 6, 7, 8, 9, 10, 15, 20, 25 or more) from the N-terminus of SEQ ID NO: 21.
Other fragments omit one or more domains of the protein (e.g. omission of a signal peptide, of a
cytoplasmic domain, of a transmembrane domain, or of an extracellular domain).

(12) GAS 309 

GAS 309 corresponds to M1 GenBank accession numbers GI: 13621426 and GI: 15674341, to M3
GenBank accession number GI: 21909633, to M18 GenBank accession number GI: 19745363, and is
also referred to as 'Spy0124' (M1), 'SpyM3_0097' (M3), 'SpyM18_0205' (M18), 'nra' and 'rofA'.
GAS 309 has also been identified as a regulatory protein and a negative transcriptional regulator.
Amino acid and polynucleotide sequences of GAS 309 of an M1 strain are set forth below:

SEQ ID NO: 23 

MIEKYLESSIESKCQLIVLFFKTSYLPITEVAEKTGLT^ 
30 .THPFKETYLYQLYASSNVLQLLAFLI KNGSH SR PLTDFARSHFLSNS S AYRMREAL I PLLRNFBLKLS KN 
KIVGEEYRIRYLIALLYSKFGIKVYDLTQQDKOTIHSFLSHSSra 
QFSVTIPQTRIFQQLKI^FVYDSLKKSSHDIIETYCQ 
QYCQLFEENDTFRLLLNPIITLLPNLKEQKAS 
TSLKLIVEBWMAKLPGKRDLNHKHFHLFCH^ 

35 . IDFHSYYLLQDNVYQI PDLKPDLVITHSQLI PFVHHELTKGI AVAEI SFDESILSIQELMYQVKEEKFQA 
DLTKQLT 

SEQ ID NO: 24 

TTGATAGAAAAATACITGGAATCATCAATCGAAT 
40 CTTATTTGCCAATAACTGAGGTAGCAGAAAAAACTGG^ 
GGAACTGAATGCCTTTCTCCCTGGTAGTCTGTCTATC 
ACACATCCTTTTAAAGAAACTTATCTTTACC 
TTTTAATAAAAAATGGTTCCCACTCTCGTCCCCTT 

CTCAGCTTATCGGATGCGCGAAGC^TTGATTCC . 
45 AAGATTGTCGGTGAGGAATATCGCATCCGTTACCTCATCGCT 

TTTATGACTTGACGCAGCAAGACAAAAAQVCTATTCATAGCTTT^ 
AACCTCTCCTTGGTTATCGGAATCGTTTTCTlTCTATGAtt 
CAATTTTCGGTAACTATTCCCCAAACCAGAATTTTTCAACAATTA 
T$AAAAAAAGTAGCCATGATATTATCGAAACTTACT^ 

50 CCTCTATrTAATTTATATCACCGCTAA^ 

CAATATTGTCAACTTTTTGAAGAAAA 

CTAACCTAAAAGAGCAAAAGGCTAGTTTAGTAAAAGCTCTTATGI^^ 
TCTGCAACATTTTATTCCTGAGACCAACTTATTCGTTT 

ACGTCCTTAAAGTTAATTGTCGAAGAGTGGATGGCCAAA(nTCCTGGTAAGCGTGACT 
5 ATTTTCATCTTTTTTCCCACTATGTCGAGCAAAGTCT 

CGTAGCCAGTAATTTTATCAATGCTCATCTCCTAACGGATTCTTT^ 
ATTGATTTTCATTCCTATTATCTATTGCAAGATAATGTT^ 
TCATCACTCACAGTCAACTGATTCCTTTTGTTCACC^ 
ATCTTTTGATGAATCGATTCTGTCTATCCAAGAA 
1 0 GATTTAACCAAGCAATTAACATAA 

Preferred GAS 309 proteins for use with the invention comprise an amino acid sequence: (a) having
50% or more identity (e.g. 60%, 65%, 70%, 75%, 80%, 85%, 90%, 91%, 92%, 93%, 94%, 95%, 96%,
97%, 98%, 99%, 99.5% or more) to SEQ ID NO: 23; and/or (b) which is a fragment of at least n
consecutive amino acids of SEQ ID NO: 23, wherein n is 7 or more (e.g. 8, 10, 12, 14, 16, 18, 20, 25,
30, 35, 40, 50, 60, 70, 80, 90, 100 or more). These GAS 309 proteins include variants (e.g. allelic
variants, homologs, orthologs, paralogs, mutants, etc.) of SEQ ID NO: 23. Preferred fragments of (b)
comprise an epitope from SEQ ID NO: 23. Other preferred fragments lack one or more amino acids
(e.g. 1, 2, 3, 4, 5, 6, 7, 8, 9, 10, 15, 20, 25 or more) from the C-terminus and/or one or more amino
acids (e.g. 1, 2, 3, 4, 5, 6, 7, 8, 9, 10, 15, 20, 25 or more) from the N-terminus of SEQ ID NO: 23.
Other fragments omit one or more domains of the protein (e.g. omission of a signal peptide, of a
cytoplasmic domain, of a transmembrane domain, or of an extracellular domain).
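
Each antigen is listed as a paired amino acid and polynucleotide sequence. As printed, SEQ ID NO: 24 opens with the codon TTG while SEQ ID NO: 23 opens with M, which is consistent with an alternative bacterial initiation codon being read as methionine. The sketch below shows how such a pair could be cross-checked. It is a hedged illustration, not part of the disclosure: it assumes the standard codon table with ATG, TTG and GTG treated as initiators (a subset of the bacterial initiators), and the function name `translate` is hypothetical. Only short, clearly legible prefixes of the listed sequences are used as inputs.

```python
BASES = "TCAG"
# Amino acids for the 64 codons in TCAG order (standard genetic code).
AMINO = "FFLLSSSSYY**CC*WLLLLPPPPHHQQRRRRIIIMTTTTNNKKSSRRVVVVAAAADDEEGGGG"
CODON_TABLE = {
    a + b + c: AMINO[16 * i + 4 * j + k]
    for i, a in enumerate(BASES)
    for j, b in enumerate(BASES)
    for k, c in enumerate(BASES)
}
# Assumption: these initiation codons are read as Met at position 1.
START_CODONS = {"ATG", "TTG", "GTG"}


def translate(cds):
    """Translate a coding sequence 5'->3', stopping at the first stop codon.

    Raises KeyError on characters other than A, C, G, T, so damaged
    (e.g. OCR-corrupted) input is rejected rather than silently translated.
    """
    cds = cds.upper()
    protein = []
    for pos in range(0, len(cds) - len(cds) % 3, 3):
        codon = cds[pos:pos + 3]
        if pos == 0 and codon in START_CODONS:
            residue = "M"
        else:
            residue = CODON_TABLE[codon]
        if residue == "*":
            break
        protein.append(residue)
    return "".join(protein)


# First 15 nucleotides of SEQ ID NO: 24 as printed in this section.
print(translate("TTGATAGAAAAATAC"))
```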

(13) GAS 372 

GAS 372 corresponds to M1 GenBank accession numbers GI:13622698 and GI:15675501, to M3
GenBank accession number GI: 21910905, to M18 GenBank accession number GI: 19746500 and is
also referred to as 'Spy1625' (M1), 'SpyM3_1369' (M3), and 'SpyM18_1634' (M18). GAS 372 has
also been identified as a putative protein kinase or a putative eukaryotic-type serine/threonine kinase.
Amino acid and polynucleotide sequences of GAS 372 of an M1 strain are set forth below:

SEQ ID NO: 25

30 M I Q I GKLF AGRYR I LKS IGRGGMAD VYLANDL I LDNE DVAI KVLRTN YQTDQ VAVAR FQREARAMAELNH 
PNIVAIRDIGEEIX3QQFLVMEYVDGADLKRYIQNH 
NILLTKEGVVTCVTDFGIAV^FAETSL^ 
PTOGDSAVTIALQHFQKPLPSI IEENHNV 

RKI I FENVESTKPLPKVASGPTASVKLS PPTPTVXTQESRLDQTNQTDAI^PPTKKKKSGRFLGTLFKIL 
35 FS FFI VGVALFTYLILTKPTSVKVPNVAGTSLKVAKQELYDVGLKVGKI RQI ESDTVAEGNWRTDPKAG 
TAKRQGSS I TLYVS IGNKGFDMENYKGLDYQEAMNSLI ETYGVPKSKI KI ERI VTNEYPENTVI SQS PSA 
GDKFNPNGKSKI TLSVAVSDTI TMPMVTEYS YADAVNTLTALGIDASR I KAYVPS SSS ATGFVPI HS PSS 

KAIVSGQSPYYGTSLSLSDKGEISLYLYPEETHSSSSSSSSTSSSNSSSINDSTAPGSNTELSPSETTSQ 
TP 


SEQ ID NO: 26 

ATGATTCAGATTGGCAAATTATTTGCTGGTCGTTATC 
CGGATGTTTATTTAGCAAATGACTTGATCTTGGATAAT^ 
TTATCAAACAGATCAGGTAGCAGTTGCX3C^TTTCCAACGAGAAGCG 
45 CCCAATATTGTTGCC^TCCGGGATATA<k^ 

ATGGTGCTGACCTAAAGAGATACATTCAAAATCATGCTCCATTATCTAATAAT^ 

GGAAGAAGTCCTTTCTGCTATGACTTTAGCCCACCAAAAAGGAATTCT 

AATATCCTACTAACTAAGGAGGGTGTTGTCAAAGT^ 

CAAGCTTGACAQUWVCTAATTCGATGTTAGGCAGTGTTCAT^ 

50 CAAAGCGACGATTCAAAGTGATATTTATGCGATGGGGATTATGCTCTT^ 

CCTTATGACGGCGATAGTGCTGTTACGATTGCCTTGCAAC^ 
AGGAGAACCACAATGTGCCACAAGCTTTGGAGAATGTTGT^ 

TCGTTACGGGTCAACCTTTGAAATGAGTCGTGACTTAATGACGGCGCTTAGTTATAATC 

CGTAAGATTATCITTGAGAATGTTGAAAGTACCAAACCCCTCCCCAAAGT^ 
5 CTGTAAAATTGTCTCCCCCTACCCCAACAGTCTTAACACAGGAAAGTCGATTAGATCAAACT 

AGATGCTTTACAGCCCCCCACCAAAAAGAAAAAAAGTGGTOSTTTm 

TTTTCTTTCTTTATTGTAGGTGTAGCACTCTTTACTTATCTTATACTAACTAAA 

TTCCTAATGTAGCAGGCACTAGTCTTAAAGTTGCCAAACAAGAACTGTATGATGTTC 

TAAAATCAGGCAAATTGAGAGTGATACGGTTGCTGAGGGAAATGTAGTTAGAACA 
1 0 ACAGCTAAGAGGCAAGGCTCAAGCATTACGCTTTATGTGTCAATTGG 

ACTACAAAGGACTAGATTATCAAGAAGCTATGAATAGTTTGAT^^ 

AATCAAAATTGAGCGCATTGTAACTAATGAATATCCrrGAAAATACAGTCATCAGTCAA 

GGTGATAAATTTAATCCAAACGGAAAGTCTAAAATTACGCTCAGTGTTGCTGTTAGTGATACGATCACTA 

TGCCTATGGTAACAGAATATAGTTATGCAGATGCAGTCAATACCTTAACAGCTTTAGG 
1 5 TAGAATAAAAGCTTATGTGCCAAGCTCTAGCTCAGCAACGGGCTTTC 

AAAGCTATTGTCAGTGGTCAATCTCCTTACTATGGAACGTCTTTGAGTCTGTCTGATAAAGC^ 

GTCTTTACCTTTATCCAGAAGAAACACACTCTTCTAGTAGCTCATCGAGTTCAACGTCT^ 

TTCITC^TAAATGATAGTACTGCACCAGGTAGCAACACTGAATTAAGCCCATCAGAAACT 

ACACCTTAA 


Preferred GAS 372 proteins for use with the invention comprise an amino acid sequence: (a) having
50% or more identity (e.g. 60%, 65%, 70%, 75%, 80%, 85%, 90%, 91%, 92%, 93%, 94%, 95%, 96%,
97%, 98%, 99%, 99.5% or more) to SEQ ID NO: 25; and/or (b) which is a fragment of at least n
consecutive amino acids of SEQ ID NO: 25, wherein n is 7 or more (e.g. 8, 10, 12, 14, 16, 18, 20, 25,
30, 35, 40, 50, 60, 70, 80, 90, 100, 150, 200, 250 or more). These GAS 372 proteins include variants
(e.g. allelic variants, homologs, orthologs, paralogs, mutants, etc.) of SEQ ID NO: 25. Preferred
fragments of (b) comprise an epitope from SEQ ID NO: 25. Other preferred fragments lack one or
more amino acids (e.g. 1, 2, 3, 4, 5, 6, 7, 8, 9, 10, 15, 20, 25 or more) from the C-terminus and/or one
or more amino acids (e.g. 1, 2, 3, 4, 5, 6, 7, 8, 9, 10, 15, 20, 25 or more) from the N-terminus of SEQ
ID NO: 25. Other fragments omit one or more domains of the protein (e.g. omission of a signal
peptide, of a cytoplasmic domain, of a transmembrane domain, or of an extracellular domain).


(14) GAS 039 

GAS 039 corresponds to M1 GenBank accession numbers GI: 13621542 and GI: 15674446, to M3
GenBank accession number GI: 21909730, to M18 GenBank accession number GI: 19745398 and is
also referred to as 'Spy0266' (M1), 'SpyM3_0194' (M3), and 'SpyM18_0250' (M18). Amino acid
and polynucleotide sequences of GAS 039 of an M1 strain are set forth below:

SEQ ID NO: 27 


MDLILFLLVLVLLGLGAYLLFKVNGLQHQLAQTL^ 
40 LYWLTDIRDVLHRSLSDSRDRSDKRLE^ 

SFDSVSKQLESWKGLGEMRSVAQDVGTLNKVLSNT^ 

SERVEYAIKLPGNGQGGYIYLPIDSKFPLEDYYRLEDAYEVGDKIAIEASRKALI^IKRFAKDIHKKYL 
NPPETTNFGVMFLPTEGLYSEVVRKASFFDSLR^ 

KI LGNVKLEFDKFGGLLAKAQKQMNTANNTLDQLI STRTNAI VRAUnVETYQDQATKSLLNMPLLEEEN 
45 NEN 

SEQ ID NO: 28 

ATGGACCTTATCTTGTTCCTTTTGGTCTTGGTTC 
ACGGCCTTCAACATCAGCTTGCCCAAACCCTAGAAGGCA^ 

50 CCAGTTGGATACAGCTAACAAACAACAATTGTTAGAGCTAACACAGCT^ 
(HTTACCAACAATTAACAGATATTCGTGACGTCTTGC^ 

ACAAACGCTTAGAAAAAATTAACCAGCAGGTCAACCAATCGCTCAAAAAT^ 

ACGTTTGGAGAAAATGCGCCAGATCGTTGAAGAAAAATTG^ 

TCTTTCXiATTCTGTATCCAAGCAACTAGAAAGTGTCAATAAAGGCT^ 

AAGATGTGGGTACTTTAAATAAGGTTTTGTCCAATAC 

5 AGGCCAAATCATTGAGGATATCATGACATCAAGCCAGTACGAAAGAGAATTTGTAA 

AGTGAACGCGTAGAATATGCGATTAAGCTCCCAGGAAATGGTCAAGGCGG 

ACTCAAAATTCCCTCTTGAAGATTATTACCGATTAGAAGATGCTTACGAA 

CGAGGCTAGCCGAAAAGCACTTCTGGCAGCTATCAAACGCTTTGCCA^ 

AACCCCCCAGAGACGACCAATTTCGGAGTTATGTTCTTACCAACAGAAGGTCTTTATTCAC^^ 

10 GAAATGCGTCTTTCTTTGATAGCCTTCGTCGGGAAGAAAATATC^ 

TGCTTTGCTGAATTCCTTATCTGTTGGTTTCAAGACCCTTAATATCCAAAAA 

AAAATTTTAGGCAATGTOVAGTTAGAATTCGATAAATTTGGCG^ 

TGAATACAGCTAATAATACGCTGGATCAGCTCATTTCAACAAGGACAAATGCCA 

TACCGTTGAAACTTATCAAGACCAAGCAACAAAATCTCTCTTGAACATGCCCTTATTAG 
15 AATGAAAATTAA 

Preferred GAS 039 proteins for use with the invention comprise an amino acid sequence: (a) having
50% or more identity (e.g. 60%, 65%, 70%, 75%, 80%, 85%, 90%, 91%, 92%, 93%, 94%, 95%, 96%,
97%, 98%, 99%, 99.5% or more) to SEQ ID NO: 27; and/or (b) which is a fragment of at least n
consecutive amino acids of SEQ ID NO: 27, wherein n is 7 or more (e.g. 8, 10, 12, 14, 16, 18, 20, 25,
30, 35, 40, 50, 60, 70, 80, 90, 100, 150 or more). These GAS 039 proteins include variants (e.g.
allelic variants, homologs, orthologs, paralogs, mutants, etc.) of SEQ ID NO: 27. Preferred fragments
of (b) comprise an epitope from SEQ ID NO: 27. Other preferred fragments lack one or more amino
acids (e.g. 1, 2, 3, 4, 5, 6, 7, 8, 9, 10, 15, 20, 25 or more) from the C-terminus and/or one or more
amino acids (e.g. 1, 2, 3, 4, 5, 6, 7, 8, 9, 10, 15, 20, 25 or more) from the N-terminus of SEQ ID
NO: 27. Other fragments omit one or more domains of the protein (e.g. omission of a signal peptide,
of a cytoplasmic domain, of a transmembrane domain, or of an extracellular domain).

(15) GAS 042

GAS 042 corresponds to M1 GenBank accession numbers GI:13621559 and GI:15674461, to M3
GenBank accession number GI: 21909745, to M18 GenBank accession number GI: 19745415, and is
also referred to as 'Spy0287' (M1), 'SpyM3_0209' (M3), and 'SpyM18_0275' (M18). Amino acid
and polynucleotide sequences of GAS 042 of an M1 strain are set forth below:

SEQ ID NO: 29 

MTKEKLVAFSQAHAEPAWI^ERRLAALEAI PNLELPTI ERVKPHRWNLGDGTLTENESLAS VPDFI AIGD 
35 NPKLVQVGTQTVLEQLPMALIDKGWFSDFYTA^ 

YVPDHLBITTPIEAIFLQDSDSDVPFNKHVLVIAGKESKF^ 

QI KFSAIDRI/3PSVTTYI SRRGRLEKDANIDWAIAVMNEGNVI ADFDSDLIGQGSQADLKVVAASSGRQV 
QGIDTRVTNYGQRTVGHI LQHGVI LERGTl/TFNGI GH I LKDAKGADAQQESRVIiMLSDQARADANPI LLI 
DENEVTAGHAASIGQVDPEDMYYLMSRGLDQETAERLVIRGFLGAVI AEI PI PSVRQEI I KVLDEKLLNR 


SEQ ID NO: 30

ATGACAAAAGAAAAACTAGTGGCTTTTTCGCAAGCC 

TAGCGGCATTAGAAGCC^TTCCAAATTTGGAA 

TCTAGGAGATGGTACCTTAACAGAAAATGAAAGTCTAGCT^ 

45 AACCCA AAGC TTGTTCAGGTAGGCACGCAAACAGTCTTAGAACAGTTA 
GAGTTGTTTTCAGTGATTTTTATACGGCE 

GGCATTAGCTTTTGATGAAGACAAACTAGCT.GCCTAC 
TACGTTCCTGATCACTTGGAAATCACAACTCCTATTGAAGCT 
TTCCTTTTAACAAGCATGTTCTAGTGATTGCAGGAAAAGAAA 
50 ATCTATTGGCAATGGCACTCAAAAGATCAGCGCTAATATCAGTGTAGAAGTGATTC 

CAGATTAAATTCTCGGCTATCGACCGCTTAGGTCCTTCAGTGACAACCTATATTAGCTC 

TAGAGAAGGATGCCAACATTGATTGGGCCT^ 

C AGTGATTTGATTGGTCAGGGCTCACAAGCTGATTTGAAAGT^ 

GAAGGTATTGACACGCGCGTGACCAACTATGGTCAACGT^ 

.TTTTGGAACGTGGCACCTTAACGTTTAACGGGATTGGTCATATTCTAAAA 
5 TCAACAAGAAAGCCGTGTTTTGATGCTTTCn^ 

GATGAAAATGAAGTAAC^GC^GGTCATGGAGCTTCTATC 
TGATGAGTCGAGGACTGGATCAAGAAACAGCAGAACGATTGGTT 

CGCTGAAATTCCTATTCCATCAGTCCGCCAAGAGATTATTAAGGTTTTAGATGAGAAAT^ 
TAA 


Preferred GAS 042 proteins for use with the invention comprise an amino acid sequence: (a) having
50% or more identity (e.g. 60%, 65%, 70%, 75%, 80%, 85%, 90%, 91%, 92%, 93%, 94%, 95%, 96%,
97%, 98%, 99%, 99.5% or more) to SEQ ID NO: 29; and/or (b) which is a fragment of at least n
consecutive amino acids of SEQ ID NO: 29, wherein n is 7 or more (e.g. 8, 10, 12, 14, 16, 18, 20, 25,
30, 35, 40, 50, 60, 70, 80, 90, 100, 150 or more). These GAS 042 proteins include variants (e.g.
allelic variants, homologs, orthologs, paralogs, mutants, etc.) of SEQ ID NO: 29. Preferred fragments
of (b) comprise an epitope from SEQ ID NO: 29. Other preferred fragments lack one or more amino
acids (e.g. 1, 2, 3, 4, 5, 6, 7, 8, 9, 10, 15, 20, 25 or more) from the C-terminus and/or one or more
amino acids (e.g. 1, 2, 3, 4, 5, 6, 7, 8, 9, 10, 15, 20, 25 or more) from the N-terminus of SEQ ID
NO: 29. Other fragments omit one or more domains of the protein (e.g. omission of a signal peptide,
of a cytoplasmic domain, of a transmembrane domain, or of an extracellular domain).

(16) GAS 058

GAS 058 corresponds to M1 GenBank accession numbers GI: 13621663 and GI:15674556, to M3
GenBank accession number GI: 21909841, to M18 GenBank accession number GI: 19745567 and is
also referred to as 'Spy0430' (M1), 'SpyM3_0305' (M3), and 'SpyM18_0477' (M18). Amino acid
and polynucleotide sequences of GAS 058 of an M1 strain are set forth below:

SEQ ID NO: 31

M KWSGFMKTKSKRFLNIATLCLALLGTT 

GYLEGYEKGLKGDDI PERPKIQVPEDVQPSDHGDYRDGYEEGFGEGQHKRDPLETEAEDDSQGGRQEGRQ 
30 GHQEGADS SDLNVEBSDGLSVI DEWGVI YQAFSTI WTYLSGLF 

SEQ ID NO: 32

ATGAAATGGAGTGGTTTTATGAAAACAAAATCAAAACGCTTTTT 
TACTAGGAACAACTTTGCTAATGGCAC^TCCCGTACAGG 
3 5 TCGCTTCGGGTTAGGCGATTTAGAAGATGATTCAGCT 

GGATATTTAGAGGGATATGAAAAAGGCTTAAAAGGAG&TGATAT 
CTGAGGATGTTCAGCCATCTGACCATGGCGACTATAGAGA 

ACATAAACGTGATCCATTAGAAACAGAAGCAGAAGATGATTCTCAAGGAGGACGTCAAG 
GGACATCAAGAAGGAGCAGATTCTAGTGATTTCAACGTTGA 
40 AAGTAGTTGGAGTAATTTATCAAGCATTTAGTACTATTTGGACATACTT 

Preferred GAS 058 proteins for use with the invention comprise an amino acid sequence: (a) having
50% or more identity (e.g. 60%, 65%, 70%, 75%, 80%, 85%, 90%, 91%, 92%, 93%, 94%, 95%, 96%,
97%, 98%, 99%, 99.5% or more) to SEQ ID NO: 31; and/or (b) which is a fragment of at least n
consecutive amino acids of SEQ ID NO: 31, wherein n is 7 or more (e.g. 8, 10, 12, 14, 16, 18, 20, 25,
30, 35, 40, 50, 60, 70, 80, 90, 100, 150 or more). These GAS 058 proteins include variants (e.g.
allelic variants, homologs, orthologs, paralogs, mutants, etc.) of SEQ ID NO: 31. Preferred fragments
of (b) comprise an epitope from SEQ ID NO: 31. Other preferred fragments lack one or more amino
acids (e.g. 1, 2, 3, 4, 5, 6, 7, 8, 9, 10, 15, 20, 25 or more) from the C-terminus and/or one or more
amino acids (e.g. 1, 2, 3, 4, 5, 6, 7, 8, 9, 10, 15, 20, 25 or more) from the N-terminus of SEQ ID
NO: 31. For example, in one embodiment, the underlined amino acid sequence at the N-terminus of
SEQ ID NO: 31 is removed. Other fragments omit one or more domains of the protein (e.g. omission
of a signal peptide, of a cytoplasmic domain, of a transmembrane domain, or of an extracellular
domain).
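
Omission of one or more domains (a signal peptide, a cytoplasmic, transmembrane or extracellular domain) amounts to deleting one or more residue ranges from the full-length sequence. A minimal sketch follows. The coordinates and the toy sequence are hypothetical placeholders: this section does not state where the signal peptide or any other domain of SEQ ID NO: 31 begins or ends, and `omit_regions` is an illustrative helper name.

```python
def omit_regions(seq, regions):
    """Return seq with the given residue ranges removed.

    regions: iterable of (start, end) pairs, 1-based and inclusive,
    which must lie within seq and must not overlap.
    """
    kept = []
    cursor = 0  # 0-based index of the next residue not yet handled
    for start, end in sorted(regions):
        if not (1 <= start <= end <= len(seq)):
            raise ValueError(f"region {(start, end)} lies outside the sequence")
        if start - 1 < cursor:
            raise ValueError(f"region {(start, end)} overlaps an earlier region")
        kept.append(seq[cursor:start - 1])  # residues before this region
        cursor = end                        # skip the omitted residues
    kept.append(seq[cursor:])               # residues after the last region
    return "".join(kept)


toy = "MKTAYIAKQR"  # hypothetical 10-residue sequence
# Hypothetical domain map: residues 1-3 as a "signal peptide",
# residues 9-10 as a C-terminal domain.
print(omit_regions(toy, [(1, 3)]))
print(omit_regions(toy, [(1, 3), (9, 10)]))
```

With real data the `(start, end)` pairs would come from an experimental or predicted domain annotation of the antigen in question.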

(17) GAS 290 

GAS 290 corresponds to M1 GenBank accession numbers GI: 13622978 and GI: 15675757, to M3
GenBank accession number GI: 21911221, to M18 GenBank accession number GI: 19746869 and is
also referred to as 'Spy1959' (M1), 'SpyM3_1685' (M3), and 'SpyM18_2026' (M18). Amino acid
and polynucleotide sequences of GAS 290 of an M1 strain are set forth below:

SEQ ID NO: 33 

MKHI LF I VGSLREGSFMHQLAAQAQKALEHQAWSYLNWKDVPVLNQDI EANAPLPWDARQAVQS ADAI 
1 5 WI FTPVYNFSI PGS VKNLLDWLSRALDLS D PTGP S A I GGKVVTVS SVANGGHDQ VFDQF KALL PF I RTS V 
AGEFTKATVNPDAWGTGRLEI SKETKANLLSQAEALLAAI 

SEQ ID NO: 34 

ATGAAACATATTTTATTTATTGTTGGCTCGCTTCGTC 
20 CACAAAAAGCTCTGGAACATCAAGCAGTTGTATCTTACTTAAATTGGAAAGATC 
AGATATCGAAGCTAATGCACCTTTACCAGTTGTTGACG 
TGGATTTTTACACCAGTTTACAACTTCTCTATTCCAGGTTCTGTTA 

GTGCTCTTGATTTGTCTGATCCGACGGGCCCATCTGCTATTGGCGGTAAGGTGGTTACGGTCTCTTCAGT 
TGCAAATGGCGGGCATGATCAAGTATTTGATCAGTTTAAAGGACTATTC 
25' GCAGGAGAGTTTACAAAAGCAACTGTGAATCCTGATGCCTGG 

AGACAAAAGCAAACTTGCTATCTCAGGCAGAGGCTCTTTTAGCGGCTATTTAG 


Preferred GAS 290 proteins for use with the invention comprise an amino acid sequence: (a) having
50% or more identity (e.g. 60%, 65%, 70%, 75%, 80%, 85%, 90%, 91%, 92%, 93%, 94%, 95%, 96%,
97%, 98%, 99%, 99.5% or more) to SEQ ID NO: 33; and/or (b) which is a fragment of at least n
consecutive amino acids of SEQ ID NO: 33, wherein n is 7 or more (e.g. 8, 10, 12, 14, 16, 18, 20, 25,
30, 35, 40, 50, 60, 70, 80, 90, 100 or more). These GAS 290 proteins include variants (e.g. allelic
variants, homologs, orthologs, paralogs, mutants, etc.) of SEQ ID NO: 33. Preferred fragments of (b)
comprise an epitope from SEQ ID NO: 33. Other preferred fragments lack one or more amino acids
(e.g. 1, 2, 3, 4, 5, 6, 7, 8, 9, 10, 15, 20, 25 or more) from the C-terminus and/or one or more amino
acids (e.g. 1, 2, 3, 4, 5, 6, 7, 8, 9, 10, 15, 20, 25 or more) from the N-terminus of SEQ ID NO: 33.
Other fragments omit one or more domains of the protein (e.g. omission of a signal peptide, of a
cytoplasmic domain, of a transmembrane domain, or of an extracellular domain).

(18) GAS 511 

GAS 511 corresponds to M1 GenBank accession numbers GI:13622798 and GI:15675592, to M3
GenBank accession number GI: 21911053, to M18 GenBank accession number GI: 19746700 and is
also referred to as 'Spy1743' (M1), 'SpyM3_1517' (M3), 'SpyM18_1815' (M18) and 'accA'. Amino
acid and polynucleotide sequences of GAS 511 of an M1 strain are set forth below:

SEQ ID NO: 35 

MTDVSRILKEARDQGRLTTLDYANLIFDDFW 
NLARNFGQPNPEGYRKALRimQAEKFGRPVW 
AI I IGEGGSGGALAIAVADQVVMLENTMYAVLS 
5 I I PEHGYF S SE I VDI I KAN L I EQI TSLQAKPLDQLLDER YQRFRKY 


SEQ ID NO: 36 

ATGACAGATGTATCAAGAATTTTAAAAGAAGCGCGTGATCAAG . 
ACCTTATTTTCGATGACTTTATGGAACTGCAT^ 
1 0 TGGCCTAGCTTATTTGGCGGGACAACCTGTTACGGTCA 

AATTTGGCAAGGAATTTTGGCCAGCCCAATCCAGAAGGTTATCGTAAAGC 
CAGAAAAATTTGGACGACCAGTTGTTACGTTTATCAATACTGCAGGAGCC^ 

AGAACGAGGACAGGGTGAGGC CATTG CTAAAAATTTGATGGAAATGAGTGATCTCAAGGTTCC CATTATC 
GeCATCATTATTGGTGAAGGAGGCTCTGGTGGTGCATTAGCCTTAGCGGTTGCCGATC 
1 5 TTGAAAATACTATGTATGCGGTTCTTAGCCCAGAAGGCTTTGCTTCTATTTTATGGAAGGA 
GGCGACCGAGGCCGCTGAATTGATGAAAATCACAGCGGGTGAACT 
ATTATTCCAGAACATGGTTATTTTTCAAGTGAAATC^ 
TAACCAGTTTGCAAGCTAAGCC^TTAGACCAATTATTAGATGAGCGCrA 
A . 


Preferred GAS 511 proteins for use with the invention comprise an amino acid sequence: (a) having
50% or more identity (e.g. 60%, 65%, 70%, 75%, 80%, 85%, 90%, 91%, 92%, 93%, 94%, 95%, 96%,
97%, 98%, 99%, 99.5% or more) to SEQ ID NO: 35; and/or (b) which is a fragment of at least n
consecutive amino acids of SEQ ID NO: 35, wherein n is 7 or more (e.g. 8, 10, 12, 14, 16, 18, 20, 25,
30, 35, 40, 50, 60, 70, 80, 90, 100 or more). These GAS 511 proteins include variants (e.g. allelic
variants, homologs, orthologs, paralogs, mutants, etc.) of SEQ ID NO: 35. Preferred fragments of (b)
comprise an epitope from SEQ ID NO: 35. Other preferred fragments lack one or more amino acids
(e.g. 1, 2, 3, 4, 5, 6, 7, 8, 9, 10, 15, 20, 25 or more) from the C-terminus and/or one or more amino
acids (e.g. 1, 2, 3, 4, 5, 6, 7, 8, 9, 10, 15, 20, 25 or more) from the N-terminus of SEQ ID NO: 35.
Other fragments omit one or more domains of the protein (e.g. omission of a signal peptide, of a
cytoplasmic domain, of a transmembrane domain, or of an extracellular domain).

(19) GAS 533 

GAS 533 corresponds to M1 GenBank accession numbers GI:13622912 and GI:15675696, to M3
GenBank accession number GI: 21911157, to M18 GenBank accession number GI: 19746804 and is
also referred to as 'Spy1877' (M1), 'SpyM3_1621' (M3), 'SpyM18_1942' (M18) and 'glnA'. GAS
533 has also been identified as a putative glutamine synthetase. Amino acid and polynucleotide
sequences of GAS 533 of an M1 strain are set forth below:

SEQ ID NO: 37 

. MAITVADIRREVTCEKNVTFIjRLMFTDIMGTO 
40 LYPDLDTWIVFPWGDENGAVAGLICDIYTAEGKPFAGDPRGNL 

MDDKGN PTLEVNDNGGY FDLAP I DLADNTRREI VNI LT KMGFE VEASHHEVAVGQHE I DFKYADVLKACD 
NIQIFKLVinCTIAREHGLYATFMAKPKFGIAGSGMHC^ 
..' GLMKHAYNYTAI TNPTVNS YKRLVPGYEAPVYVAWAGSNRS PLI RVPASRGMGTRLELRS VDPTANP YLA 
LAVLLEAGLDGI I NKI EAPE PVEAN I YTMTMEERNEAG 1 1 DL P STLHN ALKALQ KDD WQ KALG YH I YTN 
45 FLEAKRIEWSSYATFVSQWEIDHYIHNY 


SEQ ID NO: 38 

ATGGCAATAACAGTAGCTGACATTCGTCGTGA . . 

TCACTGATATCATGGGCGTTATGAAAAATGTGGAGATTCCTGCAACT 
50 GTCTAACAAGGTTATGTTTGATGGTTCATCTATCGAAGGTTTTGTACGGATCA^ 

CTTTACCCCGATTTAGACACTTGGATTGTTTTTCCC^ 

TTTGTGATATTTATACAGCAGAAGGAAAGCCTTTTGCAGGAGATCCTAGAGGAA^ 
GAAAC^CATGAACGAGATCGGCTACAAATCATTTAATCTTGGACCAGAACCAGAATT 

ATGGATGATAAAGGTAATCCXSACACTTGAAGTTAACGATAATGGTGGTTAT^ 
ACTTAGCAGACAAC^CGCGCCGTGAAATTGTGAATATTTTAACGAA^ 
TCATCATGAAGTGGCTGTTGGTCAACATGAGATO 
AATATTCAAATTTTTAAGCTAGTTGTAAAAACGATTGC^ 

CTAAACCAA^TTTGGAATAGCTGGATCAGGGATGCACTGTAACATGTCT 
TAATGCTTTTTATGATGAAGCTCATAAGCGAGGGATC 
GGACTAATGAAGCATGCTTATAACTACACTGCTATCACT^ 
TTCCAGGTTATGAGGCACCTGTTTATGTCGCITGGGCTGGA^ 

AGCATCACGTGGTATGGGAACGCGTTTGGAGTTACGTTCGGTTGATCCGACAGCTAATCCTTATTTAGCC 
TTGGCTGTTCTCTTGGAAGCTGGATTAGATGGTATCAT^ 
CTAACATTTATACCATGACAATGGAAGAACGAAATGAAGC^ 

TAATGCCrTAAAAGCTCTTCAAAAAGATGATGTGGTACAAAAGGCACTAGGTTACCATATCT^ 
■ TTCTTAGAAGCAAAACGAATTGAATGGTCTTCCTATGCAACTTTTG 
ATATTCATAATTATTAG 

Preferred GAS 533 proteins for use with the invention comprise an amino acid sequence: (a) having 
50% or more identity (eg. 60%, 65%, 70%, 75%, 80%, 85%, 90%, 91%, 92%, 93%, 94%, 95%, 96%, 
97%, 98%, 99%, 99.5% or more) to SEQ ID NO: 37; and/or (b) which is a fragment of at least n 
consecutive amino acids of SEQ ID NO: 37, wherein n is 7 or more (eg. 8, 10, 12, 14, 1 6, 18, 20, 25, 
30, 35, 40, 50, 60, 70, 80, 90, 100, 150, 200 or more). These GAS 533 proteins include variants (eg. 
allelic variants, homologs, orthologs, paralogs, mutants, etc.) of SEQ ID NO: 37. Preferred fragments 
of (b) comprise an epitope from SEQ ID NO: 37. Other preferred fragments lack one or more amino 
acids (e.g. 1, 2, 3, 4, 5, 6, 7, 8, 9, 10, 15, 20, 25 or more) from the C-terminus and/or one or more 
amino acids (e.g. 1, 2, 3, 4, 5, 6, 7, 8, 9, 10, 15, 20, 25 or more) from the N-terminus of SEQ ID
NO: 37. Other fragments omit one or more domains of the protein (eg. omission of a signal peptide, 
of a cytoplasmic domain, of a transmembrane domain, or of an extracellular domain). 
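
By way of illustration only, the identity criterion of (a) can be sketched as follows. The sketch assumes that the two sequences have already been aligned (gaps written as '-') and uses one common convention, namely identical columns divided by total alignment columns. The specification does not prescribe an alignment algorithm or an identity formula at this point, so the function names and the short toy sequences below are hypothetical.

```python
def percent_identity(aligned_a: str, aligned_b: str) -> float:
    """Percent identity between two pre-aligned sequences (gaps as '-').

    Assumed convention: identical columns / alignment columns * 100.
    """
    if len(aligned_a) != len(aligned_b):
        raise ValueError("aligned sequences must have equal length")
    # Ignore columns in which both sequences carry a gap.
    columns = [(a, b) for a, b in zip(aligned_a, aligned_b)
               if not (a == "-" and b == "-")]
    if not columns:
        raise ValueError("no alignable columns")
    # After the filter above, a == b implies a real residue match.
    matches = sum(1 for a, b in columns if a == b)
    return 100.0 * matches / len(columns)


def meets_identity(aligned_a: str, aligned_b: str, threshold: float = 50.0) -> bool:
    """True if the pair reaches the given identity threshold (e.g. 50, 90, 99.5)."""
    return percent_identity(aligned_a, aligned_b) >= threshold


# Hypothetical toy sequences, not taken from the sequence listing.
reference = "MKTAYIAKQR"
variant = "MKTGYIAKQR"   # one substitution in ten residues
print(percent_identity(reference, variant))
print(meets_identity(reference, variant, threshold=90.0))
```

Other conventions (for example, dividing by the length of the shorter sequence) give different values for the same pair, so the convention should be fixed before any threshold is applied.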

(20) GAS 527

GAS 527 corresponds to M1 GenBank accession numbers GI:13622332, GI:15675169, and
GI:24211764, to M3 GenBank accession number GI: 21910381, to M18 GenBank accession number
GI: 19746136, and is also referred to as 'Spy1204' (M1), 'SpyM3_0845' (M3), 'SpyM18_1155'
(M18) and 'guaA'. GAS 527 has also been identified as a putative GMP synthetase (glutamate
hydrolyzing) (glutamate amidotransferase). Amino acid and polynucleotide sequences of GAS 527 of
an M1 strain are set forth below:

SEQ ID NO: 39 

MTEISILNDVQKIIVLDYGSQYNQLIARRIREFGW 

AFG I DPE I FBLGI PILGI CYGMQLITHKIX3GKVVPAGQAGNREYGQS TLHLRETS KLFSGTPQEQLVLMS 
HGDAVTEI PEGFHLVGDSNDCPYAAIENTBKNLYGIQFHPEVRHSVYGNDILKNFAIS ICGARGDWSMDN
FI DMEIAKIRETVGDRKVLLGLSGGVDSSVVGVLIiQKAIGDQLTCI FVDHGLLRKDEGDQVMGMLGGKFG 
LNI IRVDASKRFLDLLADVEDPEKKRKI IGNEFVYVFDDEASKLKGVDFLAQGTLYTDI IESGTETAQTI 
KSHHNVGGLPEDMQFELIEPLNTLFKDEVRAIiGIAI 

ESDAILREEI AKAGLDRDVWQYFTVirrGVRSVGVMGDGRTYDYTI AIRAITS I DGMTADFAQLPWDVLKK 
I STRI VNEVDHVNRI VYDITSKPPATVEWE


SEQ ID NO: 40

ATGACTCAAATTTCAATTTTGAATGATGTTCAAAAAATTATCGT^ 

AGCTTATTGCTAGACGTATTCGAGAGTTTGGTGTTTTCTCCGAACTAAAAAGCCATAAAATCACCGCTCA 
AGAACTTCGTGAGATCAATCCCATAGGTATCGTTTTATCAGGAGGG^^
GCCTTTGGCATTGACCCTGAAATCTTTGAACTAGGGATTC 

TAATCACCCATAAATTAGGTGGTAAAGTTGTTCCTGCTGGACAAGCTGGTAATCGT 

AACCCTTCATCTTCGTGAAACGTCAAAATTATTTTCAGGCACACCTCAAGAA 
CATGGTGATGCTGTTACTGAAATTCCAGAAGGTTTC^

CAGCTATTGA7VAATACTGAGAAAAACCTTTACGGTATTCAGTTCCACCCAGAAGTGAGACACTCTC 

TGGAAATGACATTCTTAAAAACTTTGCTATATCAATTTGTGGCG^ 

TTTATTGACATGGAAATTGCTAAAATTCGTGAAACTGTAGGCGATCGTA 

GTGGAGTTGATTCTTCAGTTGTTGGTGTTCTACTTCAAAAAGCTATCGGTGACCAATTA^ 
CGTTGATCACGGTCTTCTTCGTAAAGACGAGGGCGAT

CTAAATATTATCCGTGTGGATGCTTCAAAACGTTTCTTAGACCTTCTTGCAGAC 

AAAAACGTAAAATTATTGGTAATGAATTTGTCTATGTTTTTGATGATGAAGCC 

TGACTTCCTTGCCCAAGGAACACTTTATACTGAT^^ 

AAATCACATCACAATGTGGGTGGTCTCCCCGAAGACATGCAGTTTC 
TTTTCAAAGATGAAGTTCGAGCGCTTGGAATCGCTCTC

ATTTCCAGGTCCTGGACTTGCTATCCGTGTCATGGGAGCAATTACTGA^ 
GAATCAGACGCTATCCTTCGTGAAGAAATTGCTAAGGCTGGACTT^ 

CAGTTAACACAGGTGTCCGTTCTGTAGGCGTCATGGGAGATGGTCGTACTTATGATTATACCATCGCCAT 

TCGTGCTATTACGTCTATTGATGGTATGACAGCTGACTTTGCTCAACTTCCTTGGGATG 
ATCTCAACATOTATCGTAAATGAAGTTGACCACGTTAACCGT^
CCGCAACAGTTGAATGGGAATAA 
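
By way of illustration only, the correspondence between a polynucleotide sequence and its amino acid sequence can be checked with a short translation routine. The sketch below assumes the standard genetic code and an in-frame input; the function name is hypothetical. It translates the last 21 nucleotides of SEQ ID NO: 40 as printed above, which yields the C-terminal residues printed for SEQ ID NO: 39.

```python
from itertools import product

# Standard genetic code, codons enumerated in T, C, A, G order.
BASES = "TCAG"
AMINO = "FFLLSSSSYY**CC*WLLLLPPPPHHQQRRRRIIIMTTTTNNKKSSRRVVVVAAAADDEEGGGG"
CODON_TABLE = {"".join(codon): aa
               for codon, aa in zip(product(BASES, repeat=3), AMINO)}


def translate(dna: str) -> str:
    """Translate an in-frame DNA string, stopping at the first stop codon."""
    dna = dna.upper()
    protein = []
    for i in range(0, len(dna) - 2, 3):
        aa = CODON_TABLE[dna[i:i + 3]]
        if aa == "*":          # stop codon ends translation
            break
        protein.append(aa)
    return "".join(protein)


# Last 21 nucleotides of SEQ ID NO: 40 as printed above (ending in stop codon TAA).
print(translate("GCAACAGTTGAATGGGAATAA"))  # ATVEWE
```

Characters that are not A, C, G or T (such as the extraction marks present in parts of this listing) raise a KeyError in this sketch rather than being silently translated.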

Preferred GAS 527 proteins for use with the invention comprise an amino acid sequence: (a) having
50% or more identity (e.g. 60%, 65%, 70%, 75%, 80%, 85%, 90%, 91%, 92%, 93%, 94%, 95%, 96%,
97%, 98%, 99%, 99.5% or more) to SEQ ID NO: 39; and/or (b) which is a fragment of at least n
consecutive amino acids of SEQ ID NO: 39, wherein n is 7 or more (e.g. 8, 10, 12, 14, 16, 18, 20, 25,
30, 35, 40, 50, 60, 70, 80, 90, 100, 150, 200 or more). These GAS 527 proteins include variants (e.g.
allelic variants, homologs, orthologs, paralogs, mutants, etc.) of SEQ ID NO: 39. Preferred fragments
of (b) comprise an epitope from SEQ ID NO: 39. Other preferred fragments lack one or more amino
acids (e.g. 1, 2, 3, 4, 5, 6, 7, 8, 9, 10, 15, 20, 25 or more) from the C-terminus and/or one or more
amino acids (e.g. 1, 2, 3, 4, 5, 6, 7, 8, 9, 10, 15, 20, 25 or more) from the N-terminus of SEQ ID
NO: 39. Other fragments omit one or more domains of the protein (e.g. omission of a signal peptide,
of a cytoplasmic domain, of a transmembrane domain, or of an extracellular domain).
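
By way of illustration only, the fragment criterion of (b) amounts to a test for a contiguous run of at least n residues of the reference sequence. The sketch below uses a hypothetical toy reference rather than a sequence from the listing, and the function names are likewise hypothetical.

```python
def is_fragment(candidate: str, reference: str, n: int = 7) -> bool:
    """True if candidate is a run of at least n consecutive residues of reference."""
    return len(candidate) >= n and candidate in reference


def fragments(reference: str, n: int = 7):
    """Yield every window of exactly n consecutive residues of reference."""
    for start in range(len(reference) - n + 1):
        yield reference[start:start + n]


# Hypothetical toy reference, not taken from the sequence listing.
reference = "MKTAYIAKQRQISFVKSHFSRQ"
print(is_fragment("AYIAKQR", reference))   # 7 consecutive residues
print(is_fragment("AYIAKQ", reference))    # too short for n = 7
print(sum(1 for _ in fragments(reference, n=7)))
```

A reference of length L contains L - n + 1 windows of length n, which is why larger values of n narrow the set of qualifying fragments.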


(21) GAS 294 

GAS 294 corresponds to M1 GenBank accession numbers GI:13622306, GI:15675145, and
GI:26006773, to M3 GenBank accession number GI: 21910357, to M18 GenBank accession number
GI: 19746111 and is also referred to as 'Spy1173' (M1), 'SpyM3_0821' (M3), 'SpyM18_1125'
(M18) and 'gid'. GAS 294 has also been identified as a putative glucose-inhibited division protein.
Amino acid and polynucleotide sequences of GAS 294 of an M1 strain are set forth below:

SEQ ID NO: 41

MSQSTAT Y INVIGAGLAGSEAAYQIAKRGI PVKLYEMRGVKATPQHKTTNFAELVCSNS FRGDSLTNAVG 
LLKEEMRRLDS 1 1 MRNGEANRV PAGGAMAVDREGY AE S VT AELENH PL I EV I RGE I TE I PDDAITVIATG 
PLTSDALAEKIHALNGGDGFYFYDAAAPIIDKS^ 

TTAEEAPLNAFEKEKYFEGCMP I EVMAKRGI KTMLYGPMKPVGLE Y PDD YTGPRDGEFKT PYAWQLRQD 
NAAGSLYNI VGFQTHLKWGEQKRVFQMI PGLEN AE FVRYGVMHRNS YMDS PNLLTETFQSRSNPNLF FAG
QMTGVEGYVESAASGLVAGINAARLFKREEALI F PQTTA I GSLPHYVTHADSKHFQPMNVNFGI I KELEG 
PRI RDKKERYEAI ASRALADLDTCLASL 

SEQ ID NO: 42 

TTGTCTCAATCAACTGCMCTTATATTAATGTTAT
AGATTGCTAAGCGCGGTATCCCCGTTAAATTGTATGAAATGCGTGGTGTCAAAG
AACC^CTAATTTTGCCGAATTGGTCTGTTCCAACT 

CTTCTCAAAGAAGAAATGCGGCGATTAGACTCCATTATTATGCGTAATGGTC 
CTGGGGGAGCAATGGCTGTTGACCGTGAGGGGTATGCAGAGAGTC 
TCTCATTGAGGTCATTCGTGGTGAAATTACAGJUVATCCCTGACGATC
CCGCTGACTTCGGATGCCCTGGCAGAAAAAATTCACGCGCT 

ATGCAGCAGCGCCTATCATTGATAAATCTACCATTGATATGAGCAAGGTTTACCTTAAAT 
TAAAGGCGAAGCTGCTTACCTCr\ACTGCCCTATGAC 
ACAACCGCAGAAGAAGCCCCGCTGAATGCCTTTGAAAAAGAAAAGTAT 
AAGTTATGGCTAAACGTGGC^TTAAAACCATGCTTTATGGACCT

AGATGACTATACAGGTCCTCGCGATGGAGAATTTAAAACGCCATATGCCGTCGTGCAATTGCGTCAAGAT 

AATGCAGCTGGAAGCCTTTATAATATCGTTGGTTTCCAAACC^ 

TTTTCCAAATGATTCCAGGGCTTGAAAATGCTGAGTTTGTCCGCTACGGC 

TATGGATTCACCAAATCTTTTAACCGAAACCTTCCAATCTCGGAGCAATCCAAACCTT^^
CAGATGACTGGAGTTGAAGGTTATGTCGAATCAGCTGCTTCAGG

GTTTGTTCAAAAGAGAAGAAGCACTTATTTTTCCTCAGACAACAK 
GACTCATGCCGACAGTAAGCATTTCCAACCAATGAACGTCAACTTT^^ 

CCACGCATTCGTGACAAAAAAGAACGTTATGAAGCTATTGCTAGTCGTGCTTTGGCAGATTTAGACACCT 
GCTTAGCGTCGCTTTAA

Preferred GAS 294 proteins for use with the invention comprise an amino acid sequence: (a) having
50% or more identity (e.g. 60%, 65%, 70%, 75%, 80%, 85%, 90%, 91%, 92%, 93%, 94%, 95%, 96%,
97%, 98%, 99%, 99.5% or more) to SEQ ID NO: 41; and/or (b) which is a fragment of at least n
consecutive amino acids of SEQ ID NO: 41, wherein n is 7 or more (e.g. 8, 10, 12, 14, 16, 18, 20, 25,
30, 35, 40, 50, 60, 70, 80, 90, 100, 150, 200 or more). These GAS 294 proteins include variants (e.g.
allelic variants, homologs, orthologs, paralogs, mutants, etc.) of SEQ ID NO: 41. Preferred fragments
of (b) comprise an epitope from SEQ ID NO: 41. Other preferred fragments lack one or more amino
acids (e.g. 1, 2, 3, 4, 5, 6, 7, 8, 9, 10, 15, 20, 25 or more) from the C-terminus and/or one or more
amino acids (e.g. 1, 2, 3, 4, 5, 6, 7, 8, 9, 10, 15, 20, 25 or more) from the N-terminus of SEQ ID
NO: 41. Other fragments omit one or more domains of the protein (e.g. omission of a signal peptide,
of a cytoplasmic domain, of a transmembrane domain, or of an extracellular domain).


(22) GAS 253 

GAS 253 corresponds to M1 GenBank accession numbers GI: 13622611, GI: 15675423, and
GI:21362716, to M3 GenBank accession number GI: 21910711, to M18 GenBank accession number
GI: 19746473 and is also referred to as 'Spy1524' (M1), 'SpyM3_1175' (M3), 'SpyM18_1541'
(M18) and 'murG'. GAS 253 has also been identified as a putative
undecaprenyl-PP-MurNAc-pentapeptide-UDPGlcNAc GlcNAc transferase. Amino acid and
polynucleotide sequences of GAS 253 of an M1 strain are set forth below:

SEQ ID NO: 43

MPKKILFTGGGTVGHVTLNLILIPKFIKIXSW^

QNLADVFKVALGLLQSLF I VAKLRPQALFS KGGFVSVPPWAAKLMKPVF IHESDRSMGLANKI AYKFA 
TTMYTTFEQEDQLSKVKHLGAVTKVFKDANQM 

HPELKQRYNI I HZ TGD PHLNBLS S HL YRVDYVTDLYQPLMAMADLVVTRGGSNTLFELIiAMAKLHLI VPL 

GKEASRGDQLBNATYFEKRGYAKQLQBPDLTLHNFD^ I QS PDFFYDLLR 

ADZSSAIKEK

SEQ ID NO: 44

ATGCCTAAGAAGATTTTATTTACAGGTGGTGGAACTGTAGCT 
CAAAATTTATCAAGGACGGTTGGGAAGTACATTATATTGGTGATAAAAATGGC^ 
TGAAAAGTCAGGCCTTGACGTGACCTTTCATGCTATCGCGACATC
CAAAATCTAGCTGATGTTTTTAAGGTTGCACTTGGCCT 
GCCCTCAAGCCCTTTTTTCCAAAGGTGGTTTTGTCTCAGTACCGCCAGT 
TAAACCAGTCTTTATTCATGAATCAGATCGCT 
ACTACCATGTATACCACTTTTGAGCAGGAAGACCAGTTGTCTAAAGTTAAACACCTTGGAGC

AGGTTTTCAAAGATGCCAACCAAATGCCTGAATCAACTCAGTTAGA 

AGACCTAAAAACCCTCTTGTTTATTGGTGGTTCGGCAGGGGCGCATG^ 

CATCCAGAATTGAAGCAACGTTATAATATCATCAATATTACAG 

CTCATCTGTATCGAGTAGATTATGTTACCGATCTCTACCAACCTTTGATGGCGATGGCT 
GACAAGAGGGGGCTCTAATACACTTTTTGAGCTACTGGCAATGGCTAAGCTACACCTC

GGTAAAGAAGCTAGCCGTGGCGATCAGTTAGAAAATGCCACTTATTTTGAGAAGAGGGGC^ 
AATTACAGGAACCTGATTTAACTTTGCATAATTTTGATCAGGCAATGGCTGATT^ 
TGATTATGAGGCTACTATGTTGGCAACTAAGGAGATTCAGTCACCGGACTTCTTTT 
GCTGATATTAGCTCCGCGATTAAGGAGAAGTAA 


Preferred GAS 253 proteins for use with the invention comprise an amino acid sequence: (a) having
50% or more identity (e.g. 60%, 65%, 70%, 75%, 80%, 85%, 90%, 91%, 92%, 93%, 94%, 95%, 96%,
97%, 98%, 99%, 99.5% or more) to SEQ ID NO: 43; and/or (b) which is a fragment of at least n
consecutive amino acids of SEQ ID NO: 43, wherein n is 7 or more (e.g. 8, 10, 12, 14, 16, 18, 20, 25,
30, 35, 40, 50, 60, 70, 80, 90, 100, 150, 200 or more). These GAS 253 proteins include variants (e.g.
allelic variants, homologs, orthologs, paralogs, mutants, etc.) of SEQ ID NO: 43. Preferred fragments
of (b) comprise an epitope from SEQ ID NO: 43. Other preferred fragments lack one or more amino
acids (e.g. 1, 2, 3, 4, 5, 6, 7, 8, 9, 10, 15, 20, 25 or more) from the C-terminus and/or one or more
amino acids (e.g. 1, 2, 3, 4, 5, 6, 7, 8, 9, 10, 15, 20, 25 or more) from the N-terminus of SEQ ID
NO: 43. Other fragments omit one or more domains of the protein (e.g. omission of a signal peptide,
of a cytoplasmic domain, of a transmembrane domain, or of an extracellular domain).

(23) GAS 529 

GAS 529 corresponds to M1 GenBank accession numbers GI:13622403, GI:15675233, and
GI:21759132, to M3 GenBank accession number GI: 21910446, to M18 GenBank accession number
GI: 19746203 and is also referred to as 'Spy1280' (M1), 'SpyM3_0910' (M3), 'SpyM18_1228'
(M18) and 'glmS'. GAS 529 has also been identified as a putative
L-glutamine-D-fructose-6-phosphate aminotransferase (glucosamine-6-phosphate synthase). Amino
acid and polynucleotide sequences of GAS 529 of an M1 strain are set forth below:


SEQ ID NO: 45 

MCG I VGWGNRNATD I LMQGLE KLEYRGYDS AG I FVANANQTNL I KS VGR I ADLRAK I G I DVAGSTG I GH
TRWATHGQSTEDNAHPHTSQTGRFVLVHNGVI ENYXHIKTEFLAGHDFKGOTDTEIAVHLIGKFVBEDKL 

SVliEAFKKSLSIIEGSYAFAlM)SQATOTIWAKNKSPL^ 
ELVILTKDKVTVTDYDGKELIRDSYTAELDLSDIGXGTC 
DPAIITSIQEADRLYILAAGTSYHAGFATKNMLBQLTDT^ 
TADS RQ VLVKANAMG I PSLTVTNV PGSTLSREATYTMLI HAGPE I AVASTKAYTAQI AALAFLAKAVGEA
NGKQEALDFWLVHELSLVAQSIEATLSEKDLVAEKV SY 
IQCBGFAAGELKHGTISLIEEDTPVI ALI SSSQLVASHTRGNIQEVAARGAHVLTWEEGLDRBGDDI IV 
NKVHPFLAPI AMVI PTQL I AY YAS LQRGLDVDKPRNLAKAVTVE 

SEQ ID NO: 46

ATGTGTGGAATTGTTGGAGTTGTTGGAAATCGCAATGCAACGGATATTT^ 
TTGAATACCGGGGTTATGATTCAGCAGGAATTTTTGTGGCTAATC 
AGTGGGGCGGATTGCTGATTTGCX5TGCCAAGATTGGCATTGATC 
ACCCGTTGGGCAACGCATGGCCAATCAACAGAGGATAATGCCCATCCTCACACGTC^ 

TTGTACITGTTCATAATGGTGTG^ 

TTTTAAGGGGCAGACAGATACTGAGATTGCAGTACACT^ 
TCAGTACTGGAAGCTTTTAAAAAATCTTTAAGCATTATTGAAGGTTC 
GCCAAGCAACTGATACTATTTATGTGGCTAAAAACAAGTC^ 
CAACATGGTTTGTTCAGATGCCATGGCCATGATTCGTGAAACCAGTGAAT
GAGCTAGTTATTTTAACCAAAGATAAGGTAACTGTTACA 

CCTACACTGCTGAATTAGACTTATCTGATATTGGCAAAGGGACTTATCCTTTCTATA 
TGATGAGCAACCAACCGTAATGCGTCAATTAATTTCAACTTATGCAGA 

GATCCGGCTATCATTACCTCTATCCAAGAGGCTGAC^GTCTTTATATTTTAGCGGCAGGGACTTC 
ATGCTGK3TTTTGCAACAAAAAATATGCTTGAGCAATTGACAGA

TGAGTGGGGTTACCACATGCCTCTGCTTAGCAAGAAACCAATGTTTATTCTACTAAGCC 

ACCXSCAGATAGTCXSTC^GTTTTAGTAAAGGCAAATGCTATGGGCATTC 

TTCCAGGATCAACCTTATCACGTGAAGCAACATACACCATGTTGATTCATGCT^ 

TGCGTCTACAAAAGCTTACACTGCACAAATTGCTGCCCT^ 
AATGGTAAGCAAGAAGCTCTTGACTTTAACTTGGTACATGAGTTGTCATTGGTTGCCCAATCTAT^

CGACTTTGTCTGAAAAAGATCTCGTGGCAGAAAAGGTTCAAGCTTTGC^ 

TTACATCGGGCGTGGCAATGATTATTACXSTTGCGATGGAAG^ 

ATTCAATGCGAAGGCTTTGCGGCTGGTGAATTGAAACATGGAACCATO 

CAGTAATCCkrTTTAATATCGTCTAGTCAGTTGGTTGCCTCTCATACGCGTGGTAATATTC^ 
TGCCCGTGGGGCTCATGTTTTAACAGTTGTGGAAGAAGGGCTTGACCGTGAGG

AATAAGGTTCATCCTTTCCTAGCCCXIGATTGCTATGGT 

CTITTACAACGTGGACTTGATGTTG^ 

Preferred GAS 529 proteins for use with the invention comprise an amino acid sequence: (a) having
50% or more identity (e.g. 60%, 65%, 70%, 75%, 80%, 85%, 90%, 91%, 92%, 93%, 94%, 95%, 96%,
97%, 98%, 99%, 99.5% or more) to SEQ ID NO: 45; and/or (b) which is a fragment of at least n
consecutive amino acids of SEQ ID NO: 45, wherein n is 7 or more (e.g. 8, 10, 12, 14, 16, 18, 20, 25,
30, 35, 40, 50, 60, 70, 80, 90, 100, 150, 200 or more). These GAS 529 proteins include variants (e.g.
allelic variants, homologs, orthologs, paralogs, mutants, etc.) of SEQ ID NO: 45. Preferred fragments
of (b) comprise an epitope from SEQ ID NO: 45. Other preferred fragments lack one or more amino
acids (e.g. 1, 2, 3, 4, 5, 6, 7, 8, 9, 10, 15, 20, 25 or more) from the C-terminus and/or one or more
amino acids (e.g. 1, 2, 3, 4, 5, 6, 7, 8, 9, 10, 15, 20, 25 or more) from the N-terminus of SEQ ID
NO: 45. Other fragments omit one or more domains of the protein (e.g. omission of a signal peptide,
of a cytoplasmic domain, of a transmembrane domain, or of an extracellular domain).

(24) GAS 045

GAS 117 corresponds to M3 GenBank accession number GI: 21909751, M18 GenBank accession
number GI: 19745421 and is referred to as 'SpyM3_0215' (M3), 'SpyM18_oppA' (M18) and 'oppA'.
GAS 045 has been identified as an oligopeptide permease. Amino acid and polynucleotide sequences
of GAS 045 from an M1 strain are set forth below:

SEQ ID NO: 47

VTPMKKSKWLAAVSVAILSVSAIAA CGNKNASGGSEATKTYKYVFWDPKSLD 
GTTDVITQNTWXSLLENDEYGNLVPSIAKD^ 

TABDFVTGLKHAVDDKSDALYVVEDSIKNLKAYQNGEVDPKEVGVKALDDKTVQYTLN^ 
ESYWNSKTTYSVLFPWAKFLKSKGKDFGTTDPSSILVNGAYFLSAFTSKSSMEFHKNEN 
YWDAKNVGIESVKLTYSIX3SDPGSFYKNPDKGEFSVARLYPNDPTYKSAKKNYADNITYG
MLTGDI RHLTWNLNRTS F KNTKKDPAQQDAGKKALNN KD FRQAI Q FAFDRAS FQAQTAGQ 
DAKTKA1J*NMLVPPTFVTIGESDFGSEVEKEMAKLGD^ 

FAKAKEALTAEGVTF PVQLD YPVDQANAATVQEAQSF KQS VEASLGKENVI VNVLETETS 
THEAQGFYAETPBQQDYDI I SSWWGPDYQDPRTYLDIMSPVGGGSVIQKLGIKAGQNKDV 
VAAAGLDTYQTLLDEAAAI T DDNDARY KA YAKAQA YLTDNAVD I P WALGGT PRVTKA VP
FSGGFSWAGSKGPLAYKGMKLQDKPVT\fi(QYEKAKEK 
SEQ ID NO: 48

GTGACTTTTATGAAGAAAACTAAATGGTC 
TCCGCTTTGGCAGCTT GTGCT^
TACAAGTACGTTTTTGTTAACGATCCAAAATCAT^
GGAACGACTGATGTGATAACACAAATGGTTGATGGT^ 
AATTTAGTACC^TCACTTGCTAAAGATTGGAAGGTTTCAAAAGAC 
TATACTCTTCGCGATGGTGTCTCTTGGTAT^ 

ACAGCAGAAGATTTTGTGACTGGTTTGAAGCACGCGGTTGACGATAAATCA 
TACGTTGTTGAAGATTCAATAAAAAACTTAAAGGCTTACCAAAATG

AAAGAAGTTGGTGTCAAAGCCCTTGACX5ATAAAACTGTTCAGTATACTTTGAACAAG 

GAAAGCTACTGGAATTCAAAAACAACTTATAGTGTGCTTTTC 

TTGAAGTCAAAAGGTAAAGATTTTGGTACAACCGATCCATCATCAATCCTTC 

GCTTACTTCTTGAGCGCCTTCACCTCAAAATCATCTATGGAATTCCATAAAAATGAAAAC 
TACTGGGATGCTAAGAATGTTGGGATAGAATCTGTTAAATTGACTTACTCAGATGGTTCA

GACCCAGGTTCGTTCTACAAGAACTTTGACJ^ 

CCAAATGACCCTACCTACAAATCAGCTAAGAAAAACTATGCTGATAACATTACTTACGGA 
ATGTTGACTGGAGATATCCGTCATTTAACATGGAA 

ACTAAGAAAGACCCTGCACAACAAGATGCCGGTAAGAAAGCTCTTAACAACAAGGATT^
CGTCAAGCTATTCAGTTTGCTTTTGACCGAGCGTCATC^
GATGCCAAAACAAAAGCCTTACGTAACATGCTTGTCC 
GAAAGTGATTTTGGTTCAGAAGTTGAAAAGGAAATGGCAAAACTTTC 
GrACGTTAACTTAGCTGATGCTGAAGATGGTCT 

TTTGCAAAAGCCAAAGAAGCTTTAACAGCTGAAGGTGTAACCTTCCC^
TACCCTGTTGACCAAGCAAACGCAGCAACTGTTCAGGAAGCC^
GTTCAAGCATCTCTTGGTAAAGAGAATGTCATTGTGAATGTTCT^ 
ACTCACGAAGCCCAAGGCTTCTATGCTGAGACCCCAGAACAACAAGACTACGATATCATT 
T^TCATGGTGGGGACCAGACTATCAAGATCCACGGACCTACCTTGACATCATGAGTCCA
GTAGCTGGTGGATCTGTTATCCAAAAACTTGGAATCAAAGCAGGTCAAAATAAGGATC
GTGGCAGCTGOVGGCCTTGATACCTACCAAACTCTTCTTGATGAAG
GACGACAACGATGCGCGCTATAAAGCTTATOCAAAAGC^ 

GCCGTAGATATTCCAGTTGTGGCATTGGGTGGCACTCCACGAGTTACTAAAGCCGTTCCA 
TTTAGCGGGGGCTTCTCTTGGGCAGGGTCTAAAGGTCCTCTAGCATATAAAGGAA 
CTTCAAGACAAACCTGTCACAGTAAAACAATACGAAAAAGCAAAAGAAAAATGGATGA^ 
GCAAAGGCTAAGTCAAATGCAAAATATGCTGAGAAGTTAGCTGATCACGTTGAAAAA

Preferred GAS 045 proteins for use with the invention comprise an amino acid sequence: (a) having
50% or more identity (e.g. 60%, 65%, 70%, 75%, 80%, 85%, 90%, 91%, 92%, 93%, 94%, 95%, 96%,
97%, 98%, 99%, 99.5% or more) to SEQ ID NO: 47; and/or (b) which is a fragment of at least n
consecutive amino acids of SEQ ID NO: 47, wherein n is 7 or more (e.g. 8, 10, 12, 14, 16, 18, 20, 25,
30, 35, 40, 50, 60, 70, 80, 90, 100, 150, 200 or more). These GAS 045 proteins include variants (e.g.
allelic variants, homologs, orthologs, paralogs, mutants, etc.) of SEQ ID NO: 47. Preferred fragments
of (b) comprise an epitope from SEQ ID NO: 47. Other preferred fragments lack one or more amino
acids (e.g. 1, 2, 3, 4, 5, 6, 7, 8, 9, 10, 15, 20, 25 or more) from the C-terminus and/or one or more
amino acids (e.g. 1, 2, 3, 4, 5, 6, 7, 8, 9, 10, 15, 20, 25 or more) from the N-terminus of SEQ ID
NO: 47. For example, in one embodiment, the underlined amino acid sequence at the N-terminus of
SEQ ID NO: 47 is removed. Other fragments omit one or more domains of the protein (e.g. omission
of a signal peptide, of a cytoplasmic domain, of a transmembrane domain, or of an extracellular
domain).
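
By way of illustration only, removal of residues from the N-terminus and/or C-terminus can be sketched as a simple slicing operation. Underlining is not preserved in this text, so the length of the N-terminal stretch removed below is a hypothetical value, as are the function name and the toy sequence.

```python
def truncate(sequence: str, n_term: int = 0, c_term: int = 0) -> str:
    """Return sequence lacking n_term residues at the N-terminus
    and c_term residues at the C-terminus."""
    if n_term < 0 or c_term < 0:
        raise ValueError("residue counts must be non-negative")
    if n_term + c_term >= len(sequence):
        raise ValueError("truncation would remove the whole sequence")
    return sequence[n_term:len(sequence) - c_term]


# Hypothetical toy sequence and hypothetical leader length of 3 residues.
toy = "MKTAYIAKQR"
print(truncate(toy, n_term=3))            # N-terminal stretch removed
print(truncate(toy, c_term=2))            # C-terminal residues removed
print(truncate(toy, n_term=1, c_term=1))  # both ends shortened
```

The same routine covers each of the enumerated counts (1, 2, 3, ... 25 or more); a truncated sequence still has to satisfy the minimum fragment length n to fall within (b).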

(25) GAS 095

GAS 095 corresponds to M1 GenBank accession numbers GI: 13622787 and GI: 15675582, to M3
GenBank accession number GI: 21911042, to M18 GenBank accession number GI: 19746634 and is
also referred to as 'Spy1733' (M1), 'SpyM3_1506' (M3), 'SpyM18_1741' (M18). GAS 095 has also
been identified as a putative transcription regulator. Amino acid and polynucleotide sequences of
GAS 095 of an M1 strain are set forth below:

SEQ ID NO: 49 

MKIGKKIVLMFTAIVLTTVLALGVYLTSAYTFST GELSKTFKDFSTSSNKSDAIKQTRAPSI
SSERASKWEGNSDSMILVTVNPKTKKTTMTSLERDTLTTLSGPKNNEMNGVBAK^ 
VQDLLN I TIDNYVQINMQGLI DLVNAVGGI TVTNEFDFPI S I AENE PEYQATVAPGTHKINGEQALVYAR 
MRYDDPEGDYGRQKRQREVIQKVLKKI1ALDSISSYRKI LSAVSSNMQTNIEISSRTI PSLLGYRDALRT 
IKTYQLKGEDATLSDGGSYQIVTSNHLLEIQNRIRTELG^ 
SSGQAPSYSDSHSSYANYSSGVDT^SASTDQDSTASSHRPATPSSSSDALAADESSSSGSGSLVPPANI
NPQT 

SEQ ID NO: 50 

atgaaaattggaaaaaaaatagttttaatgttcacagctattgtgttaacaactgtct 
tctatctaactagtgcttataccttctcaacaggagaattatcaaagacctttaaagatttttcgacatc

trcaaacaaaagtgatgccattaaacaaacaagagctttttct^ 

tcttcagagcgtgcctccaagtgggaaggaaacagtgattcgatgattttggttacggtta^ 

cct^gaaaacaactatgactagtttagaacgagataccttaaccacgttatotgg^ 

aatgaatggtgttgaagctaagcttaacgctgcttatgca 
gtgcaagatcttttgaatatcaccattgataactatgttcaaattaatatgcaaggcctta

tgaatgcagttggagggattacagttacaaatgagtttgattttcctatctcga 

tgaatatcaagctactgttgcgcctggaacacacaaaattaacggtgaacaagcttc 

atgcgttatgatgatcctga<3ggagattatggtcgacaaaagcgtcaacgtgaagt 

tcaaaaaaatccttgctcttgatagcattagctcttatcggaagattttatct 
gcaaacgaatatcgaaatctcttctcgcactatccctagtctattaggttatcgtgacgcacttagaact

attaagacttatcaactaaaaggagaagatgccactttat 

ctaatcatttgttagaaatccaaaatcgtatccgaacagaattag 

aacaaatgctactgtttatgaaaatt^ 

tcttcaggccaggctccatcttattctgatagtcatagctcttacgctaattattci^ 
ccggccagagtgctagtaoigacc^ggac^^

atcagatgctttagcagctgatgagtctagctcatcagggtc 
aaccctcagacctaa 

Preferred GAS 095 proteins for use with the invention comprise an amino acid sequence: (a) having
50% or more identity (e.g. 60%, 65%, 70%, 75%, 80%, 85%, 90%, 91%, 92%, 93%, 94%, 95%, 96%,
97%, 98%, 99%, 99.5% or more) to SEQ ID NO: 49; and/or (b) which is a fragment of at least n
consecutive amino acids of SEQ ID NO: 49, wherein n is 7 or more (e.g. 8, 10, 12, 14, 16, 18, 20, 25,
30, 35, 40, 50, 60, 70, 80, 90, 100, 150, 200 or more). These GAS 095 proteins include variants (e.g.
allelic variants, homologs, orthologs, paralogs, mutants, etc.) of SEQ ID NO: 49. Preferred fragments
of (b) comprise an epitope from SEQ ID NO: 49. Other preferred fragments lack one or more amino
acids (e.g. 1, 2, 3, 4, 5, 6, 7, 8, 9, 10, 15, 20, 25 or more) from the C-terminus and/or one or more
amino acids (e.g. 1, 2, 3, 4, 5, 6, 7, 8, 9, 10, 15, 20, 25 or more) from the N-terminus of SEQ ID
NO: 49. For example, in one embodiment, the underlined amino acid sequence at the N-terminus of
SEQ ID NO: 49 is removed. Other fragments omit one or more domains of the protein (e.g. omission
of a signal peptide, of a cytoplasmic domain, of a transmembrane domain, or of an extracellular
domain).


(26) GAS 193 

GAS 193 corresponds to M1 GenBank accession numbers GI: 13623029 and GI:15675802, to M3
GenBank accession number GI: 21911267, to M18 GenBank accession number GI: 19746914 and is
also referred to as 'Spy2025' (M1), 'SpyM3_1731' (M3), 'SpyM18_2082' (M18) and 'isp'. GAS 193
has also been identified as an immunogenic secreted protein precursor. Amino acid and
polynucleotide sequences of GAS 193 of an M1 strain are set forth below:

SEQ ID NO: 51 

MKKRKLLAVTLLSTILLNSAVPLWM

KDHKPSHTHPTPPSNDTKQTDQASSEATDKPNKDKNDTK^ 
PDQQKDQTPDKTPEKSADKTPEKGPEKATO 

RSSAAYVRHWTGDSAYTHNLLSRRYGITAEQLDGFI^SIXSIHYDKERLNGKRLLEWEKL 
AMAESSIXSTQGVAKEKGANMFGYGAFDFNPNNAKKYSDEVAIRHMVEOTIIA^ 
GQLDTLIDGGVYPTDTSGSGQW^IMTKLDQWIDDHGSTPEIPEHLKITSGTQFSEVPVGYKRSQPQNV
LTYKSETYSFGQCTVTYAYNRVKEIX3YQVDRYMGNGGDWQRKPGF 
HVAWEQI KEDGS IH SESNVMGLGTI SYRTFTAEQASLLTYWGDKLPRP 

SEQ ID NO: 52

atgaagaaaaggaaattgttagcagtaacactattaagtaccata

ttgttgctgatacctccttgcgtaatAgcacat<^tccactgatcagcctact 
ggatgacgagagtgaaacaccaaaaaaagacaaaaaaagcaaggaaacagcgtcgcagc^ 
aaagaccataagc<^tcacacactcacccaaccccccc ; 
catctgaagctactgacaaaccaaataaagacaaaaacgacaccaagcaaccagacagca

cacccc^tctcccaaagaccagtcgtctcaaaaagagtcacaaaacaaagacggccgacc
cctgatcagcaaaaagatcagacacctgataaaacaccagaaaaat 
gaccagaaaaagcaactgataaaacaccagagccaaatcgtgacgctccaaaacccatc 
agcagctgctcctgtctttataccttggagagaaagtgacaaagacctgagcaagctaaa^ 
cx3ctcatcagcggcttacgtgagacactggacaggtgactctgcctaca 

gttatgggattactgctgaacagctagatggttttttgaacagtctaggtattcactatgataaagaacg
cttaaacggaaagcgtttattagaatgggaaaaactaa 

gcaatggcagaaagctcactaggtactcagggagttgctaaagaaaaaggagccaatatgtot 

gcgcctttgacttcaacccaaacaatgccaaaaaatacagcgatgaggrtgctat^ 

agacaccatcattgccaagaaaaaccaaacctttgaaagac^ 
ggccagttggataccttgattgatggtggggt^

cagatatcatgaccaaactagaccaatggatagatgatcatggaagcaczacctc 

caagataacttccgggacacaatttagcgaagtgcccgtaggttataaaagaagtcagc 

ttgacctacaagtcagagacctacagctttggccaatgcacttggtacgcctataatcgtgtcaaagagc 

taggttatcaagtcgacaggtacatgggtaacggtggcgactggcagcgcaagccaggttt^ 
ccataaacctaaagtgggctatgtcgtctcat^

cacgttgctgttgtagagcaaatcaaagaagatggttctatcttaatttcagagtcaaatgttatgg^ 

taggcacgatttcctatcggacgttcacagctgagcaggctagtttc 

actcccaagaccataa 

Preferred GAS 193 proteins for use with the invention comprise an amino acid sequence: (a) having
50% or more identity (e.g. 60%, 65%, 70%, 75%, 80%, 85%, 90%, 91%, 92%, 93%, 94%, 95%, 96%,
97%, 98%, 99%, 99.5% or more) to SEQ ID NO: 51; and/or (b) which is a fragment of at least n
consecutive amino acids of SEQ ID NO: 51, wherein n is 7 or more (e.g. 8, 10, 12, 14, 16, 18, 20, 25,
30, 35, 40, 50, 60, 70, 80, 90, 100, 150, 200 or more). These GAS 193 proteins include variants (e.g.
allelic variants, homologs, orthologs, paralogs, mutants, etc.) of SEQ ID NO: 51. Preferred fragments
of (b) comprise an epitope from SEQ ID NO: 51. Other preferred fragments lack one or more amino
acids (e.g. 1, 2, 3, 4, 5, 6, 7, 8, 9, 10, 15, 20, 25 or more) from the C-terminus and/or one or more
amino acids (e.g. 1, 2, 3, 4, 5, 6, 7, 8, 9, 10, 15, 20, 25 or more) from the N-terminus of SEQ ID
NO: 51. Other fragments omit one or more domains of the protein (e.g. omission of a signal peptide,
of a cytoplasmic domain, of a transmembrane domain, or of an extracellular domain).

(27) GAS 137

GAS 137 corresponds to M1 GenBank accession numbers GI:13621842, GI:15674720 and
GI:30173478, to M3 GenBank accession number GI:21909998, to M18 GenBank accession number
GI: 19745749 and is also referred to as 'Spy0652' (M1), 'SpyM3_0462' and 'SpyM18_0713' (M18).
Amino acid and polynucleotide sequences of GAS 137 of an M1 strain are set forth below:

SEQ ID NO: 53

msdkhinlvivtgmsgagktvaiqsfedlgyftidnmppalvpkflelieOtnenrr^ 
keinstijdsiesnpsidfrilfldatdgblvsry 
vdttkltprqlrktisdqfsegsnqasfrievmsrc 
edvfnyvmshpesevfykhllnlivpilpayqkegksvltvaigctggqhrsvaf
eshrdqnrrketvnrs 

SEQ ID NO: 54 

ATGTCAGACAAACACATTAATTTAGTTATTC 
AGTCTTTTGAGGATCTAGGCTACITTACCATTGATAATATGCCC
ATTAATTGAACAAACCAATGAAAATCGTAGGGTGGCTTTG

AAGGAAATTAATTCTACCTTAGATAGTATTGAAAGCAATCCTAGCATTGATTTTCGGA 

ATGCAACGGAT(3GAGAATTGGTGTCACGCTATAAAGAAACCAGACGG^ 

TCGTGTGCTTGATGGTATTCGATTGK3AAAGAGAACTCCTATCTCCTTTGAAAAG 
GTGGATACAACAAAATTGACCCCJTAGACAATTGCGTAAAACCATTTC^

ATCAAGCCTCTTTCCGTATTGAAGTGA^GAGCTTTGGGTTCAAATATGGTCTTCCTTTGG^ 

GGTTTTTGATGTGCGTTTTCTACCCAATCCTTATTATCAG 

GAGGACGTTTTTAATTATGTGATGTCTCACCCAGAATCAGAGGTGTTTTAaA 

TTGTCCCTATCTTACCGGCTTACCAAAAAGAAG^ 

AGGCCAACACCGCAGCGTTGCCTTTGCCC^TTC

GT^AAGCCATCGTGATCAAAATCGTCGTAAGGAAACGGTGAATCGTTCATGA 

Preferred GAS 137 proteins for use with the invention comprise an amino acid sequence: (a) having
50% or more identity (e.g. 60%, 65%, 70%, 75%, 80%, 85%, 90%, 91%, 92%, 93%, 94%, 95%, 96%,
97%, 98%, 99%, 99.5% or more) to SEQ ID NO: 53; and/or (b) which is a fragment of at least n
consecutive amino acids of SEQ ID NO: 53, wherein n is 7 or more (e.g. 8, 10, 12, 14, 16, 18, 20, 25,
30, 35, 40, 50, 60, 70, 80, 90, 100, 150, 200 or more). These GAS 137 proteins include variants (e.g.
allelic variants, homologs, orthologs, paralogs, mutants, etc.) of SEQ ID NO: 53. Preferred fragments
of (b) comprise an epitope from SEQ ID NO: 53. Other preferred fragments lack one or more amino
acids (e.g. 1, 2, 3, 4, 5, 6, 7, 8, 9, 10, 15, 20, 25 or more) from the C-terminus and/or one or more
amino acids (e.g. 1, 2, 3, 4, 5, 6, 7, 8, 9, 10, 15, 20, 25 or more) from the N-terminus of SEQ ID
NO: 53. Other fragments omit one or more domains of the protein (e.g. omission of a signal peptide,
of a cytoplasmic domain, of a transmembrane domain, or of an extracellular domain).

(28) GAS 084 

GAS 084 corresponds to M1 GenBank accession numbers GI: 13622398 and GI: 15675229, to M3
GenBank accession number GI: 21910442, to M18 GenBank accession number GI: 19746199 and is
also referred to as 'Spy1274' (M1), 'SpyMSJWO^ and 'SpyM18_1223' (M18). GAS 084 has also
been identified as a putative amino acid ABC transporter/periplasmic amino acid binding protein.
Amino acid and polynucleotide sequences of GAS 084 of an M1 strain are set forth below:

SEQ ID NO: 55

M IIKKRTVAI LAI ASSFPLVA CQATKSLKSGDAWGVYQKQKSITVG FONT FVPMGYKDBSGRCKGPDIDL 


AKEVFHQYGLKVNFQAI NWDMKEAELNNGKI DVI WNGYS I TKERQDJCVAPTDSYMRNEQIIWKKRSDIK 
TISDMKHK\rtiGAQSASSGYDSIJjRTPKLLKDFI^ 

AKEGQLENYRMI PTTFENEAFSVGLRKEDKTLQAKINRAFRVLYQNGKFQAI SEKWFGDDVATANI KS 

SEQ ID NO: 56

ATG ATT AT AAAAAAAAGAACCGT AG C AATTTT AGC CAT AGCT AG T AGCTTTTTCTTGGT AG C TTGTCAAG 

CTACTAAAAGTCTTAAATCAGGAGATGCTTGGGGAGTT^ 

TGACAATACGTTTGTTCCTATGGGC™ 

GCTAAAGAAGTTTTTCACCAATATGGACTCAAGGTTAACTTTCAAGCTATTAATTGGGA^
CAGAACTAAACAATGGTAAAATTGATGTAATCTGGAATGGTTATT

GGTTGC CTTTACTGATT CTT ACATGAG AAATGAAC AAATT ATTGTTGTC AAAAAAAGAT CTGAT ATT AAA 
ACAATATCAGATATGAAACATAAAGTGTTAGGAGC^CAATCAGCT^ 

GAACTCCTAAACTGCTGAAAGATTTTATTAAAAATAAAGACGCTAATCAATATGAAACCTTTACAC 
TTTTATTGATTTAAAATC^GATCGTATCGATOGAATATTGATTGACAAAGTATATC 
GCAAAAGAAGGGCAATTAGAGAATTATCGGATGATCCCAACGACCTTTGAA
GACTTAGAAAAGAAGACAAAACGTTGCAAGCAAAAATTAATCGTGCT^ 
CAAATTTCAAGCTATTTCTGAGAAATGGTTTGGAGATGATGTTGCCACTGCCAATAT^ 

Preferred GAS 084 proteins for use with the invention comprise an amino acid sequence: (a) having
50% or more identity (e.g. 60%, 65%, 70%, 75%, 80%, 85%, 90%, 91%, 92%, 93%, 94%, 95%, 96%,
97%, 98%, 99%, 99.5% or more) to SEQ ID NO: 55; and/or (b) which is a fragment of at least n
consecutive amino acids of SEQ ID NO: 55, wherein n is 7 or more (e.g. 8, 10, 12, 14, 16, 18, 20, 25,
30, 35, 40, 50, 60, 70, 80, 90, 100, 150, 200 or more). These GAS 084 proteins include variants (e.g.
allelic variants, homologs, orthologs, paralogs, mutants, etc.) of SEQ ID NO: 55. Preferred fragments
of (b) comprise an epitope from SEQ ID NO: 55. Other preferred fragments lack one or more amino
acids (e.g. 1, 2, 3, 4, 5, 6, 7, 8, 9, 10, 15, 20, 25 or more) from the C-terminus and/or one or more
amino acids (e.g. 1, 2, 3, 4, 5, 6, 7, 8, 9, 10, 15, 20, 25 or more) from the N-terminus of SEQ ID
NO: 55. For example, in one embodiment, the underlined amino acid sequence at the N-terminus of
SEQ ID NO: 55 is removed. Other fragments omit one or more domains of the protein (e.g. omission of
a signal peptide, of a cytoplasmic domain, of a transmembrane domain, or of an extracellular domain).

(29) GAS 384 

GAS 384 corresponds to M1 GenBank accession numbers GI: 13622908 and GI: 15675693, to M3
GenBank accession number GI: 21911154, to M18 GenBank accession number GI: 19746801 and is
also referred to as 'Spy1874' (M1), 'SpyM3_1618' (M3), and 'SpyM18_1939' (M18). GAS 384 has
also been identified as a putative glycoprotein endopeptidase. Amino acid and polynucleotide
sequences of GAS 384 of an M1 strain are set forth below:

SEQ ID NO: 57

M KTLAFDTSN KTLS LA I LDDBTLLADMTLN I QKKHS VSLM PAI DPLMTCTDLKPQDLER I WAKGPGS YT 
GLRVAVATAKTLAYSLNI ALVGI S SLYAIJ^TC^^ 
SLEVI I EQLVBEGQLI FVGETAPFABKIQKKLPQAILLPTLPSAYECGLLGQSLAPENVDAFVPQYLKRV
EAEBNWLKDNB I KDDSHYVKRI 


SEQ ID NO: 58 

ATGAAGACACITGCATTTGATACCTGAAATAAAAC 
TAGCAGATATGACCCTTAACATTCAGAAAAAACATAGTGTT^
GACTTGTACTGATCTTAAACCTCAAGATTTAG 
GGTTTACGAGTGGCAGTTGCTACTGCAAAAACGTTAG 
CGAGTCTATATGCITrGGCTGCGTCTACTTGTA 

TGCTAGAAGGCAAAATGCGTATGTAGGTTATTATCGGCAAGGAAAATCAGTGATC 
TCACTAGAAGTTATTATAGAACAATTAGTAGAAGAAGGACAGCTCATTTTTC

TTGCTGAGAAAATTCAAAAGAAACTACCTCAGGCGA 

TGGTCTTTTGGGGCAAAGTTTGGCACCAGAAAATG 

GAAGCTGAAGAAAACTGGCTCAAAGATAATGAGATAAAAGATGATAGTCACTACGTT 


Preferred GAS 384 proteins for use with the invention comprise an amino acid sequence: (a) having
50% or more identity (e.g. 60%, 65%, 70%, 75%, 80%, 85%, 90%, 91%, 92%, 93%, 94%, 95%, 96%,
97%, 98%, 99%, 99.5% or more) to SEQ ID NO: 57; and/or (b) which is a fragment of at least n
consecutive amino acids of SEQ ID NO: 57, wherein n is 7 or more (e.g. 8, 10, 12, 14, 16, 18, 20, 25,
30, 35, 40, 50, 60, 70, 80, 90, 100, 150, 200 or more). These GAS 384 proteins include variants (e.g.
allelic variants, homologs, orthologs, paralogs, mutants, etc.) of SEQ ID NO: 57. Preferred fragments
of (b) comprise an epitope from SEQ ID NO: 57. Other preferred fragments lack one or more amino
acids (e.g. 1, 2, 3, 4, 5, 6, 7, 8, 9, 10, 15, 20, 25 or more) from the C-terminus and/or one or more
amino acids (e.g. 1, 2, 3, 4, 5, 6, 7, 8, 9, 10, 15, 20, 25 or more) from the N-terminus of SEQ ID
NO: 57. Other fragments omit one or more domains of the protein (e.g. omission of a signal peptide,
of a cytoplasmic domain, of a transmembrane domain, or of an extracellular domain).

(30) GAS 202

GAS 202 corresponds to M1 GenBank accession numbers GI: 13622431 and GI: 15675258, to M3
GenBank accession number GI: 21910527, to M18 GenBank accession number GI: 19746290 and is
also referred to as 'Spy1309' (M1), 'SpyM3_0991' (M3), 'SpyM18_1321' (M18) and 'dltD'. GAS
202 has also been identified as a putative extramembranal protein. Amino acid and polynucleotide
sequences of GAS 202 of an M1 strain are set forth below:

SEQ ID NO: 59 

MLKRLWLILGPLLIAFVLWITIFSFPTQLDHS 
GSSEWSRMDSMHPSVLAERYKRSYRPFLIGKRGSASLSHYYGIQQITN^

PSAVQMYLSNTQVI EFLLKARTDKESQFAAKRLLELNPGVS KSNLL KKVSKG KS LS RLDRA I LKCQHQVA 
LREESLFSFLGKSTNYEKRILPRVKGLPKVFSYTOLNA 

YKNFQVNYS YLASPEYNDFQLLLSEFAKRKTDVLFVIT PVNKAWADYTGLNQDKYQAAVRKI KFQLKSQG 
FHRIADFSKDGGESYFMQDTIHLGWNGWLAFDKKVQPFLETC

SEQ ID NO: 60

ATGCTTAAGAGACTCTGGTTAATTCTAGGTCCTC1TCTTATTGCCTTTC 

TTAGTTTTCCTACACAACTTGATCATTCCATAGCTC 

TTCTTTTAAAAATGGTTTGATTWIAAGACAAGCT 
GGTTCTAGCGAATGGAGTCGAATGGATAGTATGCACCCTTCGGTGC

ATAGACCATTTTTAATTGGTAAGAGAGGATCAGCATCITK5TCGCA 

CAATGAAATGCAAAAGAAAAAAGCCATCTTTGTAGTATCTCCTCAATGGTTTACTGCT 

CGTAGTGCGGTTCAGATGTACTTGTCTAACACTCA^ 

AAGAATCACAGTTTGCAGCAAAGCGTTTGCTTGAGCTTAAC^ 
AAAAGTAAGTAAGGGTAAGTCTCTTAGTCGGTTAGAC^GAGCTATTTTGAAATGTC^

TTGAGAGAAGAGTCCCTTTTTAGT1TITTAGGCAAATCTACTAACTATGA 

TTAAGGGATTACCTAAAGTATTTTCGTATAAACAATTGAATGCAT^ 

AACAkCCAACAACCGTTTTGGGATTAAAAATACAT^ ' 
TATAAGAATTTCCAAGTTAATTATAGTTACCTGGCGTC^CCAG 
CAGAATTTGCTAAACGAAAAACAGATGTACTCTTTGTTATAACT

TACCGGCTTAAATCAAGATAAGTATCAAGCGGCAG1TCGTAAAATAAAATTCCAGTT 
TTTCATCGCATTGCTGACTTCTCAAAAGATGGTGGTGAGT 
GTTGGAATGGCTGGTTAGCTTTTGATAAGAAAGTGCAACCAT^ 
CTATAAAATGAACCCITATTTTTATAGTAAAAT 

Preferred GAS 202 proteins for use with the invention comprise an amino acid sequence: (a) having
50% or more identity (eg. 60%, 65%, 70%, 75%, 80%, 85%, 90%, 91%, 92%, 93%, 94%, 95%, 96%,
97%, 98%, 99%, 99.5% or more) to SEQ ID NO: 59; and/or (b) which is a fragment of at least n
consecutive amino acids of SEQ ID NO: 59, wherein n is 7 or more (eg. 8, 10, 12, 14, 16, 18, 20, 25,
30, 35, 40, 50, 60, 70, 80, 90, 100, 150, 200 or more). These GAS 202 proteins include variants (eg.
allelic variants, homologs, orthologs, paralogs, mutants, etc.) of SEQ ID NO: 59. Preferred fragments
of (b) comprise an epitope from SEQ ID NO: 59. Other preferred fragments lack one or more amino
acids (e.g. 1, 2, 3, 4, 5, 6, 7, 8, 9, 10, 15, 20, 25 or more) from the C-terminus and/or one or more
amino acids (e.g. 1, 2, 3, 4, 5, 6, 7, 8, 9, 10, 15, 20, 25 or more) from the N-terminus of SEQ ID
NO: 59. Other fragments omit one or more domains of the protein (e.g. omission of a signal peptide,
of a cytoplasmic domain, of a transmembrane domain, or of an extracellular domain).
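
The fragment and identity criteria recited above can be illustrated with a minimal Python sketch. It is not part of the disclosure: the helper names are hypothetical, the reference string is only the first 32 residues of SEQ ID NO: 59 as printed above (not the full sequence), and the identity helper assumes the two sequences have already been aligned by an external tool, since the text does not prescribe an alignment method.

```python
# Illustrative only: helper names are hypothetical, not taken from the application.

# First 32 residues of SEQ ID NO: 59 as printed above (an excerpt, not the full protein).
REFERENCE_EXCERPT = "MLKRLWLILGPLLIAFVLWITIFSFPTQLDHS"


def is_fragment(candidate: str, reference: str, n: int = 7) -> bool:
    """Criterion (b): candidate is a run of at least n consecutive residues of reference."""
    return len(candidate) >= n and candidate in reference


def truncate(reference: str, n_term: int = 0, c_term: int = 0) -> str:
    """Return reference lacking n_term residues at the N-terminus and c_term at the C-terminus."""
    if n_term < 0 or c_term < 0 or n_term + c_term >= len(reference):
        raise ValueError("truncation must leave at least one residue")
    return reference[n_term:len(reference) - c_term]


def percent_identity(aligned_a: str, aligned_b: str) -> float:
    """Criterion (a), assuming pre-aligned, equal-length inputs ('-' marks a gap).

    Identity is counted over alignment columns; other conventions exist.
    """
    if len(aligned_a) != len(aligned_b) or not aligned_a:
        raise ValueError("inputs must be aligned to the same non-zero length")
    matches = sum(a == b and a != "-" for a, b in zip(aligned_a, aligned_b))
    return 100.0 * matches / len(aligned_a)


if __name__ == "__main__":
    print(is_fragment("LILGPLL", REFERENCE_EXCERPT))      # 7 consecutive residues
    print(truncate(REFERENCE_EXCERPT, n_term=2, c_term=3))
    print(percent_identity("MLKR", "MLKK"))
```

In practice the identity threshold would be evaluated on the output of a sequence-alignment program; the sketch only shows how the thresholds (50% identity, n of 7 or more) would be applied once an alignment is available.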

(31) GAS 057 

GAS 057 corresponds to M1 GenBank accession numbers GI: 13621655 and GI: 15674549, to M3
GenBank accession number GI: 21909834, to M18 GenBank accession number GI: 19745560 and is
also referred to as 'Spy0416' (M1), 'SpyM3_0298' (M3), 'SpyM18_0464' (M18) and 'prtS'. GAS
057 has also been identified as a putative cell envelope proteinase. Amino acid and polynucleotide
sequences of GAS 057 of an M1 strain are set forth below:

SEQ ID NO: 61 

MEKKQRFSLRKYKSGTFSVLIGSVFLVM^ 
TSQITLKTNREKEQSQDLVSE PTTTEIADTDAASMANTGSDATQKSASLP

KGQGKVVAVIDTGIDPAHQSMRI SDVSTAK^CSKEDMIJU^QKAAGINYGSWINDKWPAHNY^ENSDNI K 
ENQ FED FDEDWEN FE FDAEAE PKAI KKHKI YRPQSTQAPKETVI KTEETDGSHDI DWTQTDDDTKYESHG 
MHVTGIVAGNSKEAAATGERFIXnAPEAQVTIFTCRV^ 

NGAQLSGS KPLMEAI EKAKKAGVS VWAAGNERVYGSDHDDPLATNPDYGLVGS PSTGRTPTSVAAI NS K 
WVIQRLMTVKBLENRADLNHGKAI YSESVDFKDIKDSLGYDKSHQFAYVKESTDAGYNAQDVKGKIALI E

RDPNKTYDEMI ALAKKHGALGVLI FNNKPGQSNRSMRLTANGMGI PSAFI SHEFGKAMSQLNGNGTGSLE 

FDSWSKAPSQKGNEMNHFSNWGLTSDGYLKPDITAPGGDI YSTYNDNHYGSQTGTSMASPQI AGASLLV 

KQYLEKTQPNLPKEKIADIVICNLIjMSNAQIHVNPETKTTTS PRQQGAGLLN I DGAVTSGLYVTGKDNYGS 

ISLGNITDTMTFDVTVHNLSNKDKTLRYOT 
VTMDVSQFTKELTKQM PNGYYLEGFVRFRDSQDDQLNRVNI PFVGFKGQFENLAVAEES I YRLKSQGKTG

FYFDBSGPKDDIYVGKHFTGLVTUSSETNVSTKTISDN^^ SPN 

GDNNQDFAAFKG^LRKYQGLKASVTHASDKEHKNPLWSPESFKGDKNra 

SLTGAELPDGHYHYWSYYPDWGAKRQEMTFDMIU)RQKPVXSQATFD^ 

DSVTYLERKDNKPYTVTINDSYKYVSV^DN^ 
GDHLPQTLGKT P I KL KLTDGNYQTKE TLKDNLEMTQ SDTGLVTNQAQLAVVHRNQ PQSQLT KMNQDF F I S

PNEDGNKDFVAFKGLKNNVTfNDLTVlfVryAKDDHQKQTP I W SQAGAS VSAI ESTAWYGI TARGS KVMPGD 

YQYVVTYRDEHGKEHQKQYT I SVNDKKPMITQGRFDTI NGVDH FTPDKTKALDSSGI VREEVFYLAKKNG 

RKFDVTEGKDGI TVSDNKVYI PKNPDGS YTISKRDGVTLSDYYYLVEDRAGNVS FATLRDLKAVGKDKAV 

VWGLDLPVPEDKQIVNFTYLVlU)AIX3KPIENLEYyiW 
SFTLSADNNFQQVTFKITMLATSQITAHFDHL^

EVWSLPKGYRIEGNTKVOTLPNEVHELSLRLVKVGDASDSTGD 

AKA LPSTGEKMGLKLRIVGLVLLGLTCVFSRKKSTKD 

SEQ ID NO: 62 

GTGGAGAAAAAGCAACGTTTTTCCCTTAGAAAATACAAATCAGGAACGTT^

TTTTCTTGGTGATGACAACAACAGTAGC^ GCAGATGAGCTAAGCA 

TCACGCTCAACAACAAGCGCAACATCTCACCAATACAGAGTTG^ 

ACATCACAAATCACTCTCAAGACAAATCGTGAAAAAGAGCAATCACAAGATCT 

CAACTGAGCTAGCTGACACAGATGCAGCATCAATGGCTAATACAG^ 
TTCTTTACCGCCAGTCAATACAGATGTTCACGATTGGGTAAAAACCA
AAAGGACAAGGCAAGGTTGTCGCAGTTATTGACACAGGGATCGATCCGGCCC^
GTGATGTATCAACTGCTAAAGTAAAATCAAAAGAAGACATGCTAGCACG^ 
TTATXXGAGTTGGATAAATGATAAAGTTGTTTTTGtt 
GAAAATCAATTCGAGGATTTTGATGAGGACTGGGAAAACTTT^ 

CCATCAAAAAACACAAGATCTATCGTCCCCAATCAACCCAGGCACCX3AAAGAAACTC
AGAAACAGATGGTTCACATGATATTGACTGGACACAAACAGACGATGACACCAAAT^ 
ATGCATGTGACAGGTATTGTAGCCGGTAATAGCAAAGAAGCCGCTGCT 
TTGCACCAGAGGCCCAAGTCATGTTCATGCGTGTTTTTGCCAACGACATCAT 
CTTTATCAAAGCTATCGAAGATGCCGTGGCTTO 
AATGGGGCACAGCTTAGTGGCAGCAAGCCTCTAATGGAAGCAATTGAAAAA^

CAGTTGTTGTAGCAGCAGGAAATGAGCGCGTCTATGGATCTGACCATGATGATCCATTGG 
AGACTATGGTTTGGTCGGTTCTCCCTCAAC^^ 

TGGGTGATTCAACGTCTAATGACGGTCAAAGAATTAGAAAACCGTGCCGATTTAAACCAT 

TCTATTCAGAGTCTGTCGACTTTAAAGACATAAAAGATAGCCTAGGTTATGAT 
TTATGTCAAAGAGTCAACTGATGCGGGTTATAACGCACAAGACGTTAAAGGTAAAATTC

CGTGATCCCAATAAAACCTATGACGAAATGATTGCTTTGGCTAAGAAACATGGAG 
TTTTTAATAACAAGCCTGGTCAATCAAACCGCTCAATGCGTCT

TGCTTTCATATCGCACGAATTTGGTAAGGCCATGT^ 

TTTGACAGTGTGGTCTC^AAAGC^CCGAGTCAAAAAGGCAATG 
TAACTTCTGATGGCTATTTAAAACCTGACATTACTGCACCAGGTGGCGATATCTATTCT

TAACCACTATGGTAGCCAAACAGGAACAAGTATGGCCTCTC 

AAACAATACCTAGAAAAGACTCAGCCAAACTTGCCAAAAGAAAAAATTGCT^ 

TGATGAGCAATGCTCAAATTCATGTTAATCCAGAGACAAAAACGAC^ 

AGGATTACTTAATATTGACXK5AGCTGTC^CTAGCGGCCTTTATGTGACAGGAAAAGA 
ATATCATTAGGCAACATCACAGATACGATGACGTTTGATGTC

AAACATTACGTTATGACACAGAATTGCTAACAGATCATGTAGACCCACAAAAGGGCCGC * 

TTCTCACTCCTTAAAAACGTACCAAGGAGGAGAAGTTAC^GTCCCAGCCAATGGAAAAG 

GTTACCATGGATGTCTCACAGTTCACAAAAGAGCTAACAAAACAGATGCCAAATGGT^ 

GTTTTGTCCGCTTTAGAGATAGTCAAGATGACCAACTAAATAGAGTAAACATTCCTT^ 
AGGGCAATTTGAAAACTTAGCAGTTGCAGAAGAGTCCATTTACA^

TTTTACTTTGATGAATCAGGTC CAAAAGACGATATCTATGTCGGTAAACACTTTACAGGACTTGTCACTC 

TTGGTTCAGAGACCAATGTGTCAACCAAAACGATTTCTGACAATGGTCT 

AAATGCAGATGGCAAATTTATCTTAGAAAAAAATG£CCAAGGAAACCCT 

GGTGACAACAACCAAGATTTTGCAGCCTTCAAAGGTGTTTTCTT^ 
GTGTCTACCATGCTAGTGACAAGGAACACAAAAATCC

TAAAAACTTTAATAGTGACATTAGATTTGCAAAATCAACGACCCTGTT^ 

TCGTTAACAGGAGCTGAATTACCAGATGGGCATrATCATTATGTGGTGTCrrTA 

GTCCCAAACGTCAAGAAATGACATTTGACATGATTTTAC^ 

ATTTGATCCTGAAACAAACCGATTCAAACGAGAACCCCTA 
GACAGTGTCTTTTATCTAGAAAGAAAAGACAACAAGCCTTATACAGTTAC^

ATGTCTCAGTAGAAGACAATAAAAttTTTGTGGAGCGACAAGCTGATGGCA 

TAAAGCAAAATTAGGGGATTTCTATTACATXKSTCGAGGATTTTGCAGGGAA 

GGAGATCACTTACCACAAACATTAGGTAAAACACCAATTAAACTTAAGCTTACAGAC^ 

CCAAAGAAACGCTTAAAGATAATCTTGAAATGACACAGTCTC 
GCTAGCAGTGGTGCACCGCAATCAGCCGCAAAGCCAGCTAACAAAGATGAATCAGGATT^

CCAAACX^GATGGGAATAAAGACTTTGTGGCCTTTAAAGGCTTGAAAAATAAC 

CXWTTAACGTATACGCTAAAGATGACCACCAAAAACAAACCCCTATCTGGTCTAGTCAA 
TXTTATCCGCTATTGAAAGTACAGCCTGGTATG^

TATCAGTATGTTGTGACCTATCGTGACGAACATGGTAAAGAACATCAAAAGCAGTACACCATATCTGT^ 
ATGACAAAAAACCAATGATCACTCAGGGACGTTTTGATACCATTAATGGCX3
CAAGACAAAAGCCCTTGACTCATCAGGCTVTTGT^ 

CGTAAATTTGATGTGACAGAAGGTAAAGATGGTATCACAG 
ATCCAGATGGTTCTTACACCATTTCAAAAAGAGATGGTGTCACACTC^ 
AGATAGAGCTGGTAATGTGTCTTTTGCTACCTT^ 
GTCAATTTTGGATTAGACTTACCGGTCCCTGAAGACAA

ATGCAGATGGTAAACCGATTGAAAACCTAGAGTATTATAATAACTCAGGTAACAGTC^ 
CGGCAAATACACGGTCGAATTGTTGACCTATGACACCAATGCAGCCAAACTAGAGTCA 
TCCTITACCTTGTCAGCTGATAACAACTTCCAACAAGTT^ • 
AAATAACTGCCCACTTTGATCATCTTTrGCCAGAAGGCAGTCGCGW 
GCTAATCCCGCTTGAACAGTCCTTCTATGTGCCTAAAGC^

GAAGTTGTTGTCAGCCTGCCTAAAGGCTACCGTATCGAAGGCAACACAAAGGTGAA 
AAGTGCACGAACTATCATTACGCCTTGTCAAAGTAGGAGATGCCT
TATGTCAAAAAATAATTCACAGGCTTTGACAGCCTCTGCCACACCAA
GCAAAAGCCCTACCATCAACGGGTGAAAAAATGGGTCTCAAGTTGCGCATAGTAGGTC^ 

GACTTACTTGCGTCTTTAGCCGAAAAAAATCAACCAAAGATTGA 

Preferred GAS 057 proteins for use with the invention comprise an amino acid sequence: (a) having
50% or more identity (eg. 60%, 65%, 70%, 75%, 80%, 85%, 90%, 91%, 92%, 93%, 94%, 95%, 96%,
97%, 98%, 99%, 99.5% or more) to SEQ ID NO: 61; and/or (b) which is a fragment of at least n
consecutive amino acids of SEQ ID NO: 61, wherein n is 7 or more (e.g. 8, 10, 12, 14, 16, 18, 20, 25,
30, 35, 40, 50, 60, 70, 80, 90, 100, 150, 200 or more). These GAS 057 proteins include variants (eg.
allelic variants, homologs, orthologs, paralogs, mutants, etc.) of SEQ ID NO: 61. Preferred fragments
of (b) comprise an epitope from SEQ ID NO: 61. Other preferred fragments lack one or more amino
acids (e.g. 1, 2, 3, 4, 5, 6, 7, 8, 9, 10, 15, 20, 25 or more) from the C-terminus and/or one or more
amino acids (e.g. 1, 2, 3, 4, 5, 6, 7, 8, 9, 10, 15, 20, 25 or more) from the N-terminus of SEQ ID
NO: 61. For example, in one embodiment, the underlined amino acid sequence at the N-terminus of
SEQ ID NO: 61 is removed. In another example, the underlined amino acid sequence at the
C-terminus of SEQ ID NO: 61 is removed. Other fragments omit one or more domains of the protein
(e.g. omission of a signal peptide, of a cytoplasmic domain, of a transmembrane domain, or of an
extracellular domain).

The immunogenicity of other known GAS antigens may be improved by combination with two or
more GAS antigens from the first antigen group. Such other known GAS antigens include a second
antigen group consisting of (1) one or more variants of the M surface protein or fragments thereof,
(2) fibronectin-binding protein, (3) streptococcal heme-associated protein, or (4) SagA. These
antigens are referred to herein as the "second antigen group".

The invention thus includes an immunogenic composition comprising a combination of GAS 
antigens, said combination consisting of two to thirty-one GAS antigens of the first antigen group and
one, two, three, or four GAS antigens of the second antigen group. Preferably, the combination
consists of three, four, five, six, seven, eight, nine, or ten GAS antigens from the first antigen group.
Still more preferably, the combination consists of three, four or five GAS antigens from the first
antigen group. Preferably, the combination of GAS antigens includes either or both of GAS 40 and
GAS 117. Preferably, the combination of GAS antigens includes one or more variants of the M
surface protein. 
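
As a rough sense of scale for the combinations recited above, the following Python sketch counts the possible antigen selections. It is illustrative arithmetic only and assumes group sizes taken from the numbering used in this section (thirty-one antigens in the first group, four in the second), with each second-group member counted as a single item regardless of how many M protein variants it covers.

```python
from math import comb

# Assumptions for illustration: group sizes follow the numbering in this section.
FIRST_GROUP_SIZE = 31   # antigens numbered (1) to (31)
SECOND_GROUP_SIZE = 4   # M surface protein, fibronectin-binding protein, Shp, SagA


def count_selections(group_size: int, sizes) -> int:
    """Number of ways to choose a subset whose size is one of `sizes`."""
    return sum(comb(group_size, k) for k in sizes)


# "two to thirty-one GAS antigens of the first antigen group"
first_any = count_selections(FIRST_GROUP_SIZE, range(2, 32))
# "three, four or five GAS antigens from the first antigen group"
first_preferred = count_selections(FIRST_GROUP_SIZE, range(3, 6))
# "one, two, three, or four GAS antigens of the second antigen group"
second_any = count_selections(SECOND_GROUP_SIZE, range(1, 5))

if __name__ == "__main__":
    print(first_any, first_preferred, second_any, first_any * second_any)
```

The count for the still more preferred range (three to five first-group antigens) is several orders of magnitude smaller than for the full two-to-thirty-one range, which is one way to read the narrowing preferences.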

Each of the GAS antigens of the second antigen group is described in more detail below.

(1) M surface protein

Over 100 different type variants of the M protein have been identified. Epitopes having increased 
bactericidal activity and having decreased likelihood of cross-reacting with human tissues have been
identified in the amino terminal region and combined into fusion proteins containing approximately
six, seven, or eight M protein fragments linked in tandem. See Refs. 4, 5, 6, WO 02/094851 and WO
94/06465. (Each of the M protein variants, fragments and fusion proteins described in these
references is specifically incorporated herein by reference.)

Accordingly, the compositions of the invention may further comprise a GAS M surface protein or a
fragment or derivative thereof. One or more GAS M surface protein fragments may be combined
together in a fusion protein. Alternatively, one or more GAS M surface protein fragments are
combined with a GAS antigen or fragment thereof of the first antigen group. One example of a GAS
M protein is set forth below.

SEQ ID NO: 63 

MAKNNTNRHYSLRKLKTGTASVAVALTVIX5AGFANQTE VKANGDGN PREVI EDLAANN P AI QN I RLRYEN 
KDLKARLENAMEVAGFU3FKRAEELBKAKQALEDQRKDLET 
KEALBIJaDQASRDYHRATALEKELEEKKKALEIA 
KLELDQLSS EKEQLTIE KAKLEEEKQI SDASRQSLRRDLDASREAKKQVEKDLANLTAELDKVKEDKQI S
DASRQGLRRDLDASRBAKKQVBKDLANLT^ 

N S KLAALEKLNKELE E S KKLTE KEKAE LQ AKLE AE AKALKEQLAKQ AEE LAKLRAGKAS DS QT PDT KPGN 
KAVPGKGQAPQAGTKPNQNKAPMKETKRQLPSTGETANPFFTAAALTVMATAGVAAVVKRKEEN 

Preferred GAS M proteins for use with the invention comprise an amino acid sequence: (a) having
50% or more identity (eg. 60%, 65%, 70%, 75%, 80%, 85%, 90%, 91%, 92%, 93%, 94%, 95%, 96%,
97%, 98%, 99%, 99.5% or more) to SEQ ID NO: 63; and/or (b) which is a fragment of at least n
consecutive amino acids of SEQ ID NO: 63, wherein n is 7 or more (eg. 8, 10, 12, 14, 16, 18, 20, 25,
30, 35, 40, 50, 60, 70, 80, 90, 100, 150 or more). These GAS M proteins include variants (eg. allelic
variants, homologs, orthologs, paralogs, mutants, etc.) of SEQ ID NO: 63. Preferred fragments of (b)
comprise an epitope from SEQ ID NO: 63. Preferably, the fragment is one of those described in the
references above. Preferably, the fragment is constructed in a fusion protein with one or more
additional M protein fragments. Other preferred fragments lack one or more amino acids (e.g. 1, 2, 3,
4, 5, 6, 7, 8, 9, 10, 15, 20, 25 or more) from the C-terminus and/or one or more amino acids (e.g. 1, 2,
3, 4, 5, 6, 7, 8, 9, 10, 15, 20, 25 or more) from the N-terminus of SEQ ID NO: 63. Other fragments
omit one or more domains of the protein (eg. omission of a signal peptide, of a cytoplasmic domain,
of a transmembrane domain, or of an extracellular domain).

(2) Fibronectin-binding protein

GAS fibronectin-binding protein ('SfbI') is a multifunctional bacterial protein thought to mediate
attachment of the bacteria to host cells, facilitate bacterial internalization into cells and to bind to the
Fc fragment of human IgG, thus interfering with Fc-receptor mediated phagocytosis and antibody-
dependent cell cytotoxicity. Immunization of mice with SfbI and an 'H12 fragment' (encoded by
positions 1240 - 1854 of the SfbI gene) is discussed in Refs. 7, 8 and 9. One example of an amino
acid sequence for GAS SfbI is shown below.

SEQ ID NO: 64

MS FDGFFLHHLTNBLKENLLYGRI QKVNQPFERELVLTI RNHRKNYKLLLSAHPVFGRVQITQADFQNPQ 
VPNTFTMIMRKYLQGAVI EQLEQI DNDRI IBI KVSNKNEIGDAIQATLI I EIMGKHSN I ILVDRAENKI I 
ESIKHVGFSQNSYRTILPGSTYIEPPKTAAVNPFTITDV 

AELLTTDKLKRFREFFARPTQANLTTASFAPVLFSDSHATFETLSDMLDHFYQDKAERDRINQQASDLIH 

RVQTEIiDKNRNKLSKQEAELLATENAELFRQKGEl^

PNQNAQRYFKKYQKLKEAVKHLSGLIADTKQSITYFESVDYNLSQASIDDIEDIREELYQAGFLKSRQRD 

KRHKRKKPEQYLASDGTTILMVGRNNLQNEELTFKMAIOCGELWFW 

AELAAYYS KARLSNI*VQVDMIEAKKLHKPSGAKPGFVTYTGQKTLRVTPDQAKI LSMKLS

Preferred SfbI proteins for use with the invention comprise an amino acid sequence: (a) having 50%
or more identity (eg. 60%, 65%, 70%, 75%, 80%, 85%, 90%, 91%, 92%, 93%, 94%, 95%, 96%,
97%, 98%, 99%, 99.5% or more) to SEQ ID NO: 64; and/or (b) which is a fragment of at least n
consecutive amino acids of SEQ ID NO: 64, wherein n is 7 or more (eg. 8, 10, 12, 14, 16, 18, 20, 25,
30, 35, 40, 50, 60, 70, 80, 90, 100, or more). These SfbI proteins include variants (eg. allelic variants,
homologs, orthologs, paralogs, mutants, etc.) of SEQ ID NO: 64. Preferred fragments of (b) comprise
an epitope from SEQ ID NO: 64. Preferably, the fragment is one of those described in the references
above. Other preferred fragments lack one or more amino acids (e.g. 1, 2, 3, 4, 5, 6, 7, 8, 9, 10, 15,
20, 25 or more) from the C-terminus and/or one or more amino acids (e.g. 1, 2, 3, 4, 5, 6, 7, 8, 9, 10,
15, 20, 25 or more) from the N-terminus of SEQ ID NO: 64. Other fragments omit one or more
domains of the protein (eg. omission of a signal peptide, of a cytoplasmic domain, of a
transmembrane domain, or of an extracellular domain).

(3) Streptococcal heme-associated protein 

The GAS streptococcal heme-associated protein ('Shp') has been identified as a GAS cell surface
protein. It is thought to be cotranscribed with genes encoding homologues of an ABC transporter
involved in iron uptake in gram-negative bacteria. The Shp protein is further described in Ref. 10. One
example of a Shp protein is shown below:

SEQ ID NO: 65

MTKWIKQLLQVIVVFMISLSTMTNLVYADKGQIYGCIIQRNYRHPI

VYSDAMLEVSDAGKIVLTFIWSLADYSGOTQPWIQPGGTGSFQAVDYKITQKGTDTNGTTLDIMSLPTV 

NSIIRGSMFVEPMGREWFYLSASELIQKYSGNMIAQLVTET 

GAMITONKPKANSSNNKSLSDKKI LPSKMGLCT^ 

WKKRKKNDKTM 

Preferred Shp proteins for use with the invention comprise an amino acid sequence: (a) having 50% or 
more identity (eg. 60%, 65%, 70%, 75%, 80%, 85%, 90%, 91%, 92%, 93%, 94%, 95%, 96%, 97%, 
98%, 99%, 99.5% or more) to SEQ ID NO: 65; and/or (b) which is a fragment of at least n 
consecutive amino acids of SEQ ID NO: 65, wherein n is 7 or more (eg. 8, 10, 12, 14, 16, 18, 20, 25,
30, 35, 40, 50, 60, 70, 80, 90, 100 or more). These Shp proteins include variants (eg. allelic variants,
homologs, orthologs, paralogs, mutants, etc.) of SEQ ID NO: 65. Preferred fragments of (b) comprise
an epitope from SEQ ID NO: 65. Other preferred fragments lack one or more amino acids (e.g. 1, 2,
3, 4, 5, 6, 7, 8, 9, 10, 15, 20, 25 or more) from the C-terminus and/or one or more amino acids (e.g. 1,
2, 3, 4, 5, 6, 7, 8, 9, 10, 15, 20, 25 or more) from the N-terminus of SEQ ID NO: 65. Other fragments
omit one or more domains of the protein (eg. omission of a signal peptide, of a cytoplasmic domain,
of a transmembrane domain, or of an extracellular domain).

(4) SagA 

Streptolysin S (SLS), also known as 'SagA', is thought to be produced by almost all GAS colonies.
This cytolytic toxin is responsible for the beta-hemolysis surrounding colonies of GAS grown on
blood agar and is thought to be associated with virulence. While the full SagA peptide has not been
shown to be immunogenic, a fragment of amino acids 10-30 (SagA 10-30) has been used to
produce neutralizing antibodies. See Ref. 11. The amino acid sequence of SagA 10-30 is shown
below:

SEQ ID NO: 66 FSIATGSGNSQGGSGSYTPGKC

Preferred SagA 10-30 proteins for use with the invention comprise an amino acid sequence: (a)
having 50% or more identity (eg. 60%, 65%, 70%, 75%, 80%, 85%, 90%, 91%, 92%, 93%, 94%,
95%, 96%, 97%, 98%, 99%, 99.5% or more) to SEQ ID NO: 66; and/or (b) which is a fragment of at
least n consecutive amino acids of SEQ ID NO: 66, wherein n is 7 or more (e.g. 8, 10, 12, 14, 16, 18,
or 20). These SagA 10-30 proteins include variants (eg. allelic variants, homologs, orthologs,
paralogs, mutants, etc.) of SEQ ID NO: 66.

There is an upper limit to the number of GAS antigens which will be in the compositions of the 
invention. Preferably, the number of GAS antigens in a composition of the invention is less than 20,
less than 19, less than 18, less than 17, less than 16, less than 15, less than 14, less than 13, less than
12, less than 11, less than 10, less than 9, less than 8, less than 7, less than 6, less than 5, less than 4,
or less than 3. Still more preferably, the number of GAS antigens in a composition of the invention is
less than 6, less than 5, or less than 4. Still more preferably, the number of GAS antigens in a 
composition of the invention is 3. 

The GAS antigens used in the invention are preferably isolated, i.e., separate and discrete, from the 
whole organism with which the molecule is found in nature or, when the polynucleotide or 
polypeptide is not found in nature, are sufficiently free of other biological macromolecules so that the
polynucleotide or polypeptide can be used for its intended purpose.

Fusion proteins

The GAS antigens used in the invention may be present in the composition as individual separate 
polypeptides, but it is preferred that at least two (i.e. 2, 3, 4, 5, 6, 7, 8, 9, 10, 11, 12, 13, 14, 15, 16, 17, 
18, 19 or 20) of the antigens are expressed as a single polypeptide chain (a 'hybrid' polypeptide).
Hybrid polypeptides offer two principal advantages: first, a polypeptide that may be unstable or
poorly expressed on its own can be assisted by adding a suitable hybrid partner that overcomes the
problem; second, commercial manufacture is simplified as only one expression and purification need
be employed in order to produce two polypeptides which are both antigenically useful.

The hybrid polypeptide may comprise two or more polypeptide sequences from the first antigen
group. Accordingly, the invention includes a composition comprising a first amino acid sequence and
a second amino acid sequence, wherein said first and second amino acid sequences are selected from a
GAS antigen or a fragment thereof of the first antigen group. Preferably, the first and second amino
acid sequences in the hybrid polypeptide comprise different epitopes.

The hybrid polypeptide may comprise one or more polypeptide sequences from the first antigen group
and one or more polypeptide sequences from the second antigen group. Accordingly, the invention
includes a composition comprising a first amino acid sequence and a second amino acid sequence,
said first amino acid sequence selected from a GAS antigen or a fragment thereof from the first
antigen group and said second amino acid sequence selected from a GAS antigen or a fragment
thereof from the second antigen group. Preferably, the first and second amino acid sequences in the
hybrid polypeptide comprise different epitopes.

Hybrids consisting of amino acid sequences from two, three, four, five, six, seven, eight, nine, or ten 
GAS antigens are preferred. In particular, hybrids consisting of amino acid sequences from two, 
three, four, or five GAS antigens are preferred. 

Different hybrid polypeptides may be mixed together in a single formulation. Within such 
combinations, a GAS antigen may be present in more than one hybrid polypeptide and/or as a
non-hybrid polypeptide. It is preferred, however, that an antigen is present either as a hybrid or as a
non-hybrid, but not as both. 

Hybrid polypeptides can be represented by the formula NH2-A-{-X-L-}n-B-COOH, wherein: X is an
amino acid sequence of a GAS antigen or a fragment thereof from the first antigen group or the
second antigen group; L is an optional linker amino acid sequence; A is an optional N-terminal amino
acid sequence; B is an optional C-terminal amino acid sequence; and n is 2, 3, 4, 5, 6, 7, 8, 9, 10, 11,
12, 13, 14 or 15.

If a -X- moiety has a leader peptide sequence in its wild-type form, this may be included or omitted in
the hybrid protein. In some embodiments, the leader peptides will be deleted except for that of the -X-
moiety located at the N-terminus of the hybrid protein i.e. the leader peptide of X1 will be retained,
but the leader peptides of X2 ... Xn will be omitted. This is equivalent to deleting all leader peptides
and using the leader peptide of X1 as moiety -A-.
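
The hybrid formula and the leader-peptide rule can be sketched in Python as follows. This is an illustration under stated assumptions, not the applicants' method: the `Moiety` class and `build_hybrid` function are hypothetical names, and the short sequences in the usage example are placeholders rather than GAS antigen sequences.

```python
from dataclasses import dataclass
from typing import Sequence


@dataclass
class Moiety:
    """One -X- unit of the hybrid, split into wild-type leader and remainder."""
    leader: str        # leader peptide ("" if the antigen has none)
    mature: str        # antigen or fragment without its leader
    linker: str = ""   # optional -L- that follows this -X-


def build_hybrid(moieties: Sequence[Moiety], a: str = "", b: str = "") -> str:
    """Assemble A-{X-L}n-B, keeping only the leader of the first -X- moiety."""
    if not 2 <= len(moieties) <= 15:
        raise ValueError("n must be between 2 and 15")
    parts = [a]
    for index, moiety in enumerate(moieties):
        leader = moiety.leader if index == 0 else ""   # leaders of X2..Xn are dropped
        parts.append(leader + moiety.mature + moiety.linker)
    parts.append(b)
    return "".join(parts)


if __name__ == "__main__":
    # Placeholder sequences for illustration only.
    x1 = Moiety(leader="MKK", mature="AAAA", linker="GSGGGG")
    x2 = Moiety(leader="MRR", mature="CCCC")
    print(build_hybrid([x1, x2], b="HHHHHH"))
```

Passing the leader of X1 as `a` while giving every moiety an empty `leader` produces the same string, which mirrors the equivalence stated in the paragraph above.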

For each of the n instances of {-X-L-}, linker amino acid sequence -L- may be present or absent. For
instance, when n=2 the hybrid may be NH2-X1-L1-X2-L2-COOH, NH2-X1-X2-COOH,
NH2-X1-L1-X2-COOH, NH2-X1-X2-L2-COOH, etc. Linker amino acid sequence(s) -L- will typically be
short (eg. 20 or fewer amino acids i.e. 19, 18, 17, 16, 15, 14, 13, 12, 11, 10, 9, 8, 7, 6, 5, 4, 3, 2, 1).
Examples comprise short peptide sequences which facilitate cloning, poly-glycine linkers (i.e.
comprising Glyn where n = 2, 3, 4, 5, 6, 7, 8, 9, 10 or more), and histidine tags (i.e. Hisn where
n = 3, 4, 5, 6, 7, 8, 9, 10 or more). Other suitable linker amino acid sequences will be apparent to
those skilled in the art. A useful linker is GSGGGG, with the Gly-Ser dipeptide being formed from a
BamHI restriction site, thus aiding cloning and manipulation, and the (Gly)4 tetrapeptide being a
typical poly-glycine linker.
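
The linker options can be enumerated mechanically. The Python sketch below is illustrative only (function names are hypothetical): it lists every present/absent arrangement of linkers for a given n, and checks that the BamHI recognition sequence GGATCC, read as two codons, encodes the Gly-Ser dipeptide mentioned above.

```python
from itertools import product


def linker_layouts(n: int) -> list:
    """All 2**n layouts of NH2-{X-L}n-COOH in which each linker is present or absent."""
    layouts = []
    for present in product((True, False), repeat=n):
        parts = ["NH2"]
        for i, has_linker in enumerate(present, start=1):
            parts.append(f"X{i}")
            if has_linker:
                parts.append(f"L{i}")
        parts.append("COOH")
        layouts.append("-".join(parts))
    return layouts


# Only the two codons needed here, from the standard genetic code.
CODONS = {"GGA": "G", "TCC": "S"}
BAMHI_SITE = "GGATCC"


def translate(dna: str) -> str:
    """Translate an in-frame DNA string using the small codon table above."""
    return "".join(CODONS[dna[i:i + 3]] for i in range(0, len(dna), 3))


if __name__ == "__main__":
    for layout in linker_layouts(2):
        print(layout)
    print(translate(BAMHI_SITE) + "GGGG")   # the GSGGGG linker
```

For n=2 this yields the four layouts listed in the text; the number of layouts doubles with each additional -X- moiety.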

-A- is an optional N-terminal amino acid sequence. This will typically be short (e.g. 40 or fewer
amino acids i.e. 39, 38, 37, 36, 35, 34, 33, 32, 31, 30, 29, 28, 27, 26, 25, 24, 23, 22, 21, 20, 19, 18, 17,
16, 15, 14, 13, 12, 11, 10, 9, 8, 7, 6, 5, 4, 3, 2, 1). Examples include leader sequences to direct protein
trafficking, or short peptide sequences which facilitate cloning or purification (e.g. histidine tags i.e.
Hisn where n = 3, 4, 5, 6, 7, 8, 9, 10 or more). Other suitable N-terminal amino acid sequences will be
apparent to those skilled in the art. If X1 lacks its own N-terminus methionine, -A- is preferably an
oligopeptide (eg. with 1, 2, 3, 4, 5, 6, 7 or 8 amino acids) which provides a N-terminus methionine.

-B- is an optional C-terminal amino acid sequence. This will typically be short (eg. 40 or fewer 
amino acids i.e. 39, 38, 37, 36, 35, 34, 33, 32, 31, 30, 29, 28, 27, 26, 25, 24, 23, 22, 21, 20, 19, 18, 17,
16, 15, 14, 13, 12, 11, 10, 9, 8, 7, 6, 5, 4, 3, 2, 1). Examples include sequences to direct protein
trafficking, short peptide sequences which facilitate cloning or purification (eg. comprising histidine
tags i.e. Hisn where n = 3, 4, 5, 6, 7, 8, 9, 10 or more), or sequences which enhance protein stability.
Other suitable C-terminal amino acid sequences will be apparent to those skilled in the art. 

Most preferably, n is 2 or 3. 

The invention also provides nucleic acid encoding hybrid polypeptides of the invention. Furthermore,
the invention provides nucleic acid which can hybridise to this nucleic acid, preferably under "high
stringency" conditions (eg. 65°C in a 0.1xSSC, 0.5% SDS solution).

Polypeptides of the invention can be prepared by various means (eg. recombinant expression,
purification from cell culture, chemical synthesis, etc.) and in various forms (eg. native, fusions,
non-glycosylated, lipidated, etc.). They are preferably prepared in substantially pure form (i.e.
substantially free from other GAS or host cell proteins).

Nucleic acid according to the invention can be prepared in many ways (eg. by chemical synthesis, 
from genomic or cDNA libraries, from the organism itself, etc.) and can take various forms (eg. 
single stranded, double stranded, vectors, probes, etc.). They are preferably prepared in substantially 
pure form (i.e. substantially free from other GAS or host cell nucleic acids).

The term "nucleic acid" includes DNA and RNA, and also their analogues, such as those containing 
modified backbones (eg. phosphorothioates, etc.), and also peptide nucleic acids (PNA), etc. The
invention includes nucleic acid comprising sequences complementary to those described above (eg.
for antisense or probing purposes).
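
A complementary sequence of the kind referred to above can be derived as a reverse complement. The following Python sketch is illustrative only: the function name is hypothetical, it handles unambiguous DNA bases only, and the example input is the first twelve nucleotides of SEQ ID NO: 60 as printed above.

```python
# Complement table for unambiguous DNA bases (upper and lower case).
_COMPLEMENT = str.maketrans("ACGTacgt", "TGCAtgca")


def reverse_complement(dna: str) -> str:
    """Return the reverse complement of a DNA sequence, read 5' to 3'."""
    if set(dna.upper()) - set("ACGT"):
        raise ValueError("sequence contains characters other than A, C, G, T")
    return dna.translate(_COMPLEMENT)[::-1]


if __name__ == "__main__":
    # First twelve nucleotides of SEQ ID NO: 60 as printed above.
    print(reverse_complement("ATGCTTAAGAGA"))
```

Rejecting non-ACGT characters is a deliberate choice here, since several printed sequence lines contain scanning damage that should not be silently complemented.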

The invention also provides a process for producing a polypeptide of the invention, comprising the
step of culturing a host cell transformed with nucleic acid of the invention under conditions which 
induce polypeptide expression. 

The invention provides a process for producing a polypeptide of the invention, comprising the step of 
synthesising at least part of the polypeptide by chemical means. 

The invention provides a process for producing nucleic acid of the invention, comprising the step of
amplifying nucleic acid using a primer-based amplification method (eg. PCR).

The invention provides a process for producing nucleic acid of the invention, comprising the step of
synthesising at least part of the nucleic acid by chemical means.

Strains

Preferred polypeptides of the invention comprise an amino acid sequence found in an M1, M3 or M18
strain of GAS. The genomic sequence of an M1 GAS strain is reported at Ref. 12. The genomic
sequence of an M3 GAS strain is reported at Ref. 13. The genomic sequence of an M18 GAS strain is
reported at Ref. 14.

Where hybrid polypeptides are used, the individual antigens within the hybrid (i.e. individual -X-
moieties) may be from one or more strains. Where n=2, for instance, X2 may be from the same strain
as X1 or from a different strain. Where n=3, the strains might be (i) X1=X2=X3 (ii) X1=X2≠X3 (iii)
X1≠X2=X3 (iv) X1≠X2≠X3 or (v) X1=X3≠X2, etc.
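
The five strain arrangements listed for n=3 correspond to the distinct ways of grouping three moieties by strain of origin. A short Python sketch (illustrative only, with a hypothetical function name) enumerates these groupings for any n by labelling strains in order of first appearance:

```python
from itertools import product


def strain_patterns(n: int) -> list:
    """Distinct same-strain/different-strain patterns for n moieties.

    Each pattern is a tuple of labels, one per moiety; equal labels mean the
    moieties come from the same strain. Labels are assigned in order of first
    appearance so that renamings of the same grouping collapse to one pattern.
    """
    patterns = set()
    for labels in product(range(n), repeat=n):
        first_seen = {}
        canonical = tuple(first_seen.setdefault(label, len(first_seen)) for label in labels)
        patterns.add(canonical)
    return sorted(patterns)


if __name__ == "__main__":
    for pattern in strain_patterns(3):
        print(pattern)
```

For n=2 the function returns two patterns (same strain or different strains), matching the n=2 case described in the text.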

Purification and Recombinant Expression

The GAS antigens of the invention may be isolated from a Streptococcus pyogenes, or they may be 
recombinantly produced, for instance, in a heterologous host. Preferably, the GAS antigens are
prepared using a heterologous host. The heterologous host may be prokaryotic (e.g. a bacterium) or
eukaryotic. It is preferably E.coli, but other suitable hosts include Bacillus subtilis, Vibrio cholerae,
Salmonella typhi, Salmonella typhimurium, Neisseria lactamica, Neisseria cinerea, Mycobacteria
(eg. M.tuberculosis), yeasts, etc.

Recombinant production of polypeptides is facilitated by adding a tag protein to the GAS antigen to
be expressed as a fusion protein comprising the tag protein and the GAS antigen. Such tag proteins
can facilitate purification, detection and stability of the expressed protein. Tag proteins suitable for
use in the invention include a polyarginine tag (Arg-tag), polyhistidine tag (His-tag), FLAG-tag,
Strep-tag, c-myc-tag, S-tag, calmodulin-binding peptide, cellulose-binding domain, SBP-tag, chitin-
binding domain, glutathione S-transferase-tag (GST), maltose-binding protein, transcription
termination anti-termination factor (NusA), E. coli thioredoxin (TrxA) and protein disulfide
isomerase I (DsbA). Preferred tag proteins include His-tag and GST. A full discussion on the use of
tag proteins can be found at Ref. 15.

After purification, the tag proteins may optionally be removed from the expressed fusion protein, e.g.,
by specifically tailored enzymatic treatments known in the art. Commonly used proteases include
enterokinase, tobacco etch virus (TEV) protease, thrombin, and factor Xa.
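
Tag removal can be pictured as cutting the fusion at a protease recognition site placed between tag and antigen. The Python sketch below is illustrative only: the recognition motifs are the commonly cited consensus sites for these proteases and are not taken from this application, the function name is hypothetical, and the fusion sequence in the usage example is a placeholder.

```python
# Commonly cited consensus sites, written as (residues before the cut, residues after).
# These motifs are an assumption for illustration; they do not come from the application.
CLEAVAGE_SITES = {
    "enterokinase": ("DDDDK", ""),   # cuts after DDDDK
    "TEV": ("ENLYFQ", "G"),          # cuts between Q and G
    "thrombin": ("LVPR", "GS"),      # cuts between R and G
    "factor Xa": ("IEGR", ""),       # cuts after IEGR
}


def cleave(fusion: str, protease: str) -> tuple:
    """Split a fusion protein at the first recognition site of the given protease.

    Returns (tag-side fragment, antigen-side fragment).
    """
    before, after = CLEAVAGE_SITES[protease]
    start = fusion.find(before + after)
    if start < 0:
        raise ValueError(f"no {protease} site found")
    cut = start + len(before)
    return fusion[:cut], fusion[cut:]


if __name__ == "__main__":
    # Placeholder fusion: His6 tag, TEV site, then a stand-in antigen "AAAA".
    tag_side, antigen_side = cleave("MHHHHHH" + "ENLYFQG" + "AAAA", "TEV")
    print(tag_side, antigen_side)
```

Note that some proteases leave residues from the recognition site on the antigen side (for example the glycine after a TEV cut), which may matter when the exact N-terminus of the antigen is important.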

Immunogenic compositions and medicaments

Compositions of the invention are preferably immunogenic compositions, and are more preferably
vaccine compositions. The pH of the composition is preferably between 6 and 8, preferably about 7. 
The pH may be maintained by the use of a buffer. The composition may be sterile and/or 
pyrogen-free. The composition may be isotonic with respect to humans. 

Vaccines according to the invention may either be prophylactic (i.e. to prevent infection) or
therapeutic (i.e. to treat infection), but will typically be prophylactic. Accordingly, the invention
includes a method for the therapeutic or prophylactic treatment of a Streptococcus pyogenes infection
in an animal susceptible to streptococcal infection comprising administering to said animal a
therapeutic or prophylactic amount of the immunogenic compositions of the invention. Preferably,
the immunogenic composition comprises a combination of GAS antigens, said combination consisting
of two to thirty-one GAS antigens of the first antigen group. Preferably, the combination of GAS
antigens consists of three, four, five, six, seven, eight, nine, or ten GAS antigens selected from the
first antigen group. Preferably, the combination of GAS antigens consists of three, four, or five GAS
antigens selected from the first antigen group. Preferably, the combination of GAS antigens includes
either or both of GAS 40 and GAS 117.

Alternatively, the invention includes an immunogenic composition comprising a combination of GAS 
antigens, said combination consisting of two to thirty-one GAS antigens of the first antigen group and
one, two, three, or four GAS antigens of the second antigen group. Preferably, the combination
consists of three, four, five, six, seven, eight, nine, or ten GAS antigens from the first antigen group.
Still more preferably, the combination consists of three, four or five GAS antigens from the first
antigen group. Preferably, the combination of GAS antigens includes either or both of GAS 40 and
GAS 117. Preferably, the combination of GAS antigens includes one or more variants of the M
surface protein. 

The invention also provides a composition of the invention for use as a medicament. The medicament 
is preferably able to raise an immune response in a mammal (i.e. it is an immunogenic composition) 
and is more preferably a vaccine.

The invention also provides the use of the compositions of the invention in the manufacture of a
medicament for raising an immune response in a mammal. The medicament is preferably a vaccine. 

The invention also provides for a kit comprising a first component comprising a combination of GAS
antigens. In one embodiment, the combination of GAS antigens consists of a mixture of two to
thirty-one GAS antigens selected from the first antigen group. Preferably, the combination consists of
three, four, five, six, seven, eight, nine, or ten GAS antigens from the first antigen group. More
preferably, the combination consists of three, four, or five GAS antigens from the first antigen group.
Preferably, the combination includes either or both of GAS 117 and GAS 40.

In another embodiment, the kit comprises a first component comprising a combination of GAS 
antigens consisting of a mixture of two to thirty-one GAS antigens of the first antigen group and one, 
two, three, or four GAS antigens of the second antigen group. Preferably, the combination consists of
three, four, five, six, seven, eight, nine, or ten GAS antigens from the first antigen group. Still more 
preferably, the combination consists of three, four or five GAS antigens from the first antigen group. 
Preferably, the combination of GAS antigens includes either or both of GAS 40 and GAS 117. 
Preferably, the combination of GAS antigens includes one or more variants of the M surface protein. 

The invention also provides a delivery device pre-filled with the immunogenic compositions of the
invention.

The invention also provides a method for raising an immune response in a mammal comprising the
step of administering an effective amount of a composition of the invention. The immune response is 
preferably protective and preferably involves antibodies and/or cell-mediated immunity. The method 
may raise a booster response. 

The mammal is preferably a human. Where the vaccine is for prophylactic use, the human is
preferably a child (eg. a toddler or infant) or a teenager; where the vaccine is for therapeutic use, the
human is preferably a teenager or an adult. A vaccine intended for children may also be administered
to adults eg. to assess safety, dosage, immunogenicity, etc.

These uses and methods are preferably for the prevention and/or treatment of a disease caused by 
Streptococcus pyogenes (eg. pharyngitis (such as streptococcal sore throat), scarlet fever, impetigo,
erysipelas, cellulitis, septicemia, toxic shock syndrome, necrotizing fasciitis (flesh eating disease) and 
sequelae (such as rheumatic fever and acute glomerulonephritis)). The compositions may also be 
effective against other streptococcal bacteria. 

One way of checking efficacy of therapeutic treatment involves monitoring GAS infection after
administration of the composition of the invention. One way of checking efficacy of prophylactic
treatment involves monitoring immune responses against the GAS antigens in the compositions of the
invention after administration of the composition.

Compositions of the invention will generally be administered directly to a patient. Direct delivery may
be accomplished by parenteral injection (e.g. subcutaneously, intraperitoneally, intravenously,
intramuscularly, or to the interstitial space of a tissue), or by rectal, oral (eg. tablet, spray), vaginal,
topical, transdermal {eg. see ref. 16} or transcutaneous {eg. see refs. 17 & 18}, intranasal {eg. see
ref. 19}, ocular, aural, pulmonary or other mucosal administration.

The invention may be used to elicit systemic and/or mucosal immunity. 

Dosage treatment can be a single dose schedule or a multiple dose schedule. Multiple doses may be 
used in a primary immunisation schedule and/or in a booster immunisation schedule. In a multiple
dose schedule the various doses may be given by the same or different routes eg. a parenteral prime 
and mucosal boost, a mucosal prime and parenteral boost, etc. 

The compositions of the invention may be prepared in various forms. For example, the compositions
may be prepared as injectables, either as liquid solutions or suspensions. Solid forms suitable for
solution in, or suspension in, liquid vehicles prior to injection can also be prepared (eg. a lyophilised
composition). The composition may be prepared for topical administration eg. as an ointment, cream
or powder. The composition may be prepared for oral administration eg. as a tablet or capsule, as a
spray, or as a syrup (optionally flavoured). The composition may be prepared for pulmonary
administration eg. as an inhaler, using a fine powder or a spray. The composition may be prepared as
a suppository or pessary. The composition may be prepared for nasal, aural or ocular administration
eg. as drops. The composition may be in kit form, designed such that a combined composition is
reconstituted just prior to administration to a patient. Such kits may comprise one or more antigens in
liquid form and one or more lyophilised antigens.

Immunogenic compositions used as vaccines comprise an immunologically effective amount of
antigen(s), as well as any other components, as needed. By 'immunologically effective amount', it is
meant that the administration of that amount to an individual, either in a single dose or as part of a
series, is effective for treatment or prevention. This amount varies depending upon the health and
physical condition of the individual to be treated, age, the taxonomic group of individual to be treated
(eg. non-human primate, primate, etc.), the capacity of the individual's immune system to synthesise
antibodies, the degree of protection desired, the formulation of the vaccine, the treating doctor's
assessment of the medical situation, and other relevant factors. It is expected that the amount will fall
in a relatively broad range that can be determined through routine trials.

Further components of the composition

The composition of the invention will typically, in addition to the components mentioned above,
comprise one or more 'pharmaceutically acceptable carriers', which include any carrier that does not
itself induce the production of antibodies harmful to the individual receiving the composition.
Suitable carriers are typically large, slowly metabolised macromolecules such as proteins,
polysaccharides, polylactic acids, polyglycolic acids, polymeric amino acids, amino acid copolymers,
and lipid aggregates (such as oil droplets or liposomes). Such carriers are well known to those of
ordinary skill in the art. The vaccines may also contain diluents, such as water, saline, glycerol, etc.
Additionally, auxiliary substances, such as wetting or emulsifying agents, pH buffering substances,
and the like, may be present. A thorough discussion of pharmaceutically acceptable excipients is
available in reference 20.

Vaccines of the invention may be administered in conjunction with other immunoregulatory agents. In 
particular, compositions will usually include an adjuvant. 

Preferred further adjuvants include, but are not limited to, one or more of the following set forth
below:

A. Mineral Containing Compositions 

Mineral containing compositions suitable for use as adjuvants in the invention include mineral salts,
such as aluminium salts and calcium salts. The invention includes mineral salts such as hydroxides
(e.g. oxyhydroxides), phosphates (eg. hydroxyphosphates, orthophosphates), sulphates, etc. {eg. see
chapters 8 & 9 of ref. 21}, or mixtures of different mineral compounds, with the compounds taking
any suitable form (eg. gel, crystalline, amorphous, etc.), and with adsorption being preferred. The
mineral containing compositions may also be formulated as a particle of metal salt. See ref. 22.

B. Oil-Emulsions 

Oil-emulsion compositions suitable for use as adjuvants in the invention include squalene-water
emulsions, such as MF59 (5% Squalene, 0.5% Tween 80, and 0.5% Span 85, formulated into
submicron particles using a microfluidizer). See ref. 23.

Complete Freund's adjuvant (CFA) and incomplete Freund's adjuvant (IFA) may also be used as
adjuvants in the invention. 

C. Saponin Formulations 

Saponin formulations may also be used as adjuvants in the invention. Saponins are a heterologous
group of sterol glycosides and triterpenoid glycosides that are found in the bark, leaves, stems, roots
and even flowers of a wide range of plant species. Saponins from the bark of the Quillaia saponaria
Molina tree have been widely studied as adjuvants. Saponin can also be commercially obtained from
Smilax ornata (sarsaparilla), Gypsophila paniculata (bride's veil), and Saponaria officinalis (soap
root). Saponin adjuvant formulations include purified formulations, such as QS21, as well as lipid
formulations, such as ISCOMs.

Saponin compositions have been purified using High Performance Thin Layer Chromatography
(HP-TLC) and Reversed Phase High Performance Liquid Chromatography (RP-HPLC). Specific purified
fractions using these techniques have been identified, including QS7, QS17, QS18, QS21, QH-A,
QH-B and QH-C. Preferably, the saponin is QS21. A method of production of QS21 is disclosed in U.S.
Patent No. 5,057,540. Saponin formulations may also comprise a sterol, such as cholesterol (see WO
96/33739).

Combinations of saponins and cholesterols can be used to form unique particles called
Immunostimulating Complexes (ISCOMs). ISCOMs typically also include a phospholipid such as
phosphatidylethanolamine or phosphatidylcholine. Any known saponin can be used in ISCOMs.
Preferably, the ISCOM includes one or more of Quil A, QHA and QHC. ISCOMs are further
described in EP 0 109 942, WO 96/11711 and WO 96/33739. Optionally, the ISCOMs may be
devoid of additional detergent. See ref. 24.

A review of the development of saponin based adjuvants can be found at ref. 25.

C. Virosomes and Virus Like Particles (VLPs)

Virosomes and Virus Like Particles (VLPs) can also be used as adjuvants in the invention. These
structures generally contain one or more proteins from a virus optionally combined or formulated with
a phospholipid. They are generally non-pathogenic, non-replicating and generally do not contain any
of the native viral genome. The viral proteins may be recombinantly produced or isolated from whole
viruses. These viral proteins suitable for use in virosomes or VLPs include proteins derived from
influenza virus (such as HA or NA), Hepatitis B virus (such as core or capsid proteins), Hepatitis E
virus, measles virus, Sindbis virus, Rotavirus, Foot-and-Mouth Disease virus, Retrovirus, Norwalk
virus, human Papilloma virus, HIV, RNA-phages, Qβ-phage (such as coat proteins), GA-phage,
fr-phage, AP205 phage, and Ty (such as retrotransposon Ty protein p1). VLPs are discussed further in
WO 03/024480, WO 03/024481, and Refs. 26, 27, 28 and 29. Virosomes are discussed further in, for
example, Ref. 30.

D. Bacterial or Microbial Derivatives 

Adjuvants suitable for use in the invention include bacterial or microbial derivatives such as:

(1) Non-toxic derivatives of enterobacterial lipopolysaccharide (LPS)

Such derivatives include monophosphoryl lipid A (MPL) and 3-O-deacylated MPL (3dMPL).
3dMPL is a mixture of 3 De-O-acylated monophosphoryl lipid A with 4, 5 or 6 acylated chains. A
preferred "small particle" form of 3 De-O-acylated monophosphoryl lipid A is disclosed in EP 0 689
454. Such "small particles" of 3dMPL are small enough to be sterile filtered through a 0.22 micron
membrane (see EP 0 689 454). Other non-toxic LPS derivatives include monophosphoryl lipid A
mimics, such as aminoalkyl glucosaminide phosphate derivatives eg. RC-529. See Ref. 31.

(2) Lipid A Derivatives 

Lipid A derivatives include derivatives of lipid A from Escherichia coli such as OM-174. OM-174 is
described for example in Refs. 32 and 33.

(3) Immunostimulatory oligonucleotides

Immunostimulatory oligonucleotides suitable for use as adjuvants in the invention include nucleotide 
sequences containing a CpG motif (a sequence containing an unmethylated cytosine followed by 
guanosine and linked by a phosphate bond). Bacterial double stranded RNA or oligonucleotides 
containing palindromic or poly(dG) sequences have also been shown to be immunostimulatory. 

The CpGs can include nucleotide modifications/analogs such as phosphorothioate modifications and
can be double-stranded or single-stranded. Optionally, the guanosine may be replaced with an analog
such as 2'-deoxy-7-deazaguanosine. See ref. 34, WO 02/26757 and WO 99/62923 for examples of
possible analog substitutions. The adjuvant effect of CpG oligonucleotides is further discussed in
Refs. 35, 36, WO 98/40100, U.S. Patent No. 6,207,646, U.S. Patent No. 6,239,116, and U.S. Patent
No. 6,429,199.

The CpG sequence may be directed to TLR9, such as the motif GTCGTT or TTCGTT. See ref. 37.
The CpG sequence may be specific for inducing a Th1 immune response, such as a CpG-A ODN, or it
may be more specific for inducing a B cell response, such as a CpG-B ODN. CpG-A and CpG-B ODNs
are discussed in refs. 38, 39 and WO 01/95935. Preferably, the CpG is a CpG-A ODN.

Preferably, the CpG oligonucleotide is constructed so that the 5' end is accessible for receptor
recognition. Optionally, two CpG oligonucleotide sequences may be attached at their 3' ends to form
"immunomers". See, for example, refs. 40, 41, 42 and WO 03/035836.
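As an illustration of the sequence features described above, the following minimal Python sketch scans an oligonucleotide for CpG dinucleotides and for the two TLR9-directed hexamer motifs named in the text (GTCGTT and TTCGTT). The function names and the example sequence are hypothetical and are not taken from this document; the methylation state of a cytosine cannot be read from a plain sequence string and is not modelled.

```python
# Only the motifs GTCGTT and TTCGTT come from the text; all names are hypothetical.
TLR9_MOTIFS = ("GTCGTT", "TTCGTT")


def find_all(seq, motif):
    """Return 0-based start positions of every (possibly overlapping) motif hit."""
    seq, motif = seq.upper(), motif.upper()
    hits = []
    start = seq.find(motif)
    while start != -1:
        hits.append(start)
        start = seq.find(motif, start + 1)
    return hits


def cpg_report(seq):
    """Summarise CpG dinucleotides and TLR9-directed motifs in an oligonucleotide."""
    return {
        "cpg_positions": find_all(seq, "CG"),
        "tlr9_motifs": {m: find_all(seq, m) for m in TLR9_MOTIFS},
    }


# Made-up example sequence, for illustration only.
example = "AAGTCGTTCCTTCGTTAA"
report = cpg_report(example)
print(report)
```

Positions are reported 0-based from the 5' end of the string as written.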

(4) ADP-ribosylating toxins and detoxified derivatives thereof. 

Bacterial ADP-ribosylating toxins and detoxified derivatives thereof may be used as adjuvants in the
invention. Preferably, the protein is derived from E. coli (i.e., E. coli heat labile enterotoxin "LT"),
cholera ("CT"), or pertussis ("PT"). The use of detoxified ADP-ribosylating toxins as mucosal
adjuvants is described in WO 95/17211 and as parenteral adjuvants in WO 98/42375. Preferably, the
adjuvant is a detoxified LT mutant such as LT-K63.

E. Human Immunomodulators 

Human immunomodulators suitable for use as adjuvants in the invention include cytokines, such as
interleukins (eg. IL-1, IL-2, IL-4, IL-5, IL-6, IL-7, IL-12, etc.), interferons (eg. interferon-γ),
macrophage colony stimulating factor, and tumor necrosis factor.

F. Bioadhesives and Mucoadhesives

Bioadhesives and mucoadhesives may also be used as adjuvants in the invention. Suitable
bioadhesives include esterified hyaluronic acid microspheres (Ref. 43) or mucoadhesives such as
cross-linked derivatives of poly(acrylic acid), polyvinyl alcohol, polyvinyl pyrrolidone,
polysaccharides and carboxymethylcellulose. Chitosan and derivatives thereof may also be used as
adjuvants in the invention. See, e.g., ref. 44.

G. Microparticles 

Microparticles may also be used as adjuvants in the invention. Microparticles (i.e. a particle of
~100nm to ~150μm in diameter, more preferably ~200nm to ~30μm in diameter, and most preferably
~500nm to ~10μm in diameter) formed from materials that are biodegradable and non-toxic (e.g. a
poly(α-hydroxy acid), a polyhydroxybutyric acid, a polyorthoester, a polyanhydride, a
polycaprolactone, etc.), with poly(lactide-co-glycolide), are preferred, optionally treated to have a
negatively-charged surface (eg. with SDS) or a positively-charged surface (eg. with a cationic
detergent, such as CTAB).

H. Liposomes 

Examples of liposome formulations suitable for use as adjuvants are described in U.S. Patent No.
6,090,406, U.S. Patent No. 5,916,588, and EP 0 626 169. 

I. Polyoxyethylene Ether and Polyoxyethylene Ester Formulations

Adjuvants suitable for use in the invention include polyoxyethylene ethers and polyoxyethylene 
esters. Ref. 45. Such formulations further include polyoxyethylene sorbitan ester surfactants in 
combination with an octoxynol (Ref. 46) as well as polyoxyethylene alkyl ethers or ester surfactants
in combination with at least one additional non-ionic surfactant such as an octoxynol (Ref. 47). 

Preferred polyoxyethylene ethers are selected from the following group: polyoxyethylene-9-lauryl
ether (laureth 9), polyoxyethylene-9-stearyl ether, polyoxyethylene-8-stearyl ether,
polyoxyethylene-4-lauryl ether, polyoxyethylene-35-lauryl ether, and polyoxyethylene-23-lauryl ether.

J. Polyphosphazene (PCPP)

PCPP formulations are described, for example, in Refs. 48 and 49.

K. Muramyl peptides

Examples of muramyl peptides suitable for use as adjuvants in the invention include
N-acetylmuramyl-L-threonyl-D-isoglutamine (thr-MDP), N-acetyl-normuramyl-L-alanyl-D-isoglutamine
(nor-MDP), and N-acetylmuramyl-L-alanyl-D-isoglutaminyl-L-alanine-2-(1'-2'-dipalmitoyl-sn-glycero-3-
hydroxyphosphoryloxy)-ethylamine (MTP-PE).

L. Imidazoquinolone Compounds

Examples of imidazoquinolone compounds suitable for use as adjuvants in the invention include
Imiquimod and its homologues, described further in Refs. 50 and 51.

The invention may also comprise combinations of aspects of one or more of the adjuvants identified
above. For example, the following adjuvant compositions may be used in the invention: 

(1) a saponin and an oil-in-water emulsion (ref. 52);

(2) a saponin (e.g., QS21) + a non-toxic LPS derivative (e.g., 3dMPL) (see WO 94/00153);

(3) a saponin (e.g., QS21) + a non-toxic LPS derivative (e.g., 3dMPL) + a cholesterol;

(4) a saponin (eg. QS21) + 3dMPL + IL-12 (optionally + a sterol) (Ref. 53);
combinations of 3dMPL with, for example, QS21 and/or oil-in-water emulsions (Ref. 54);

(5) SAF, containing 10% Squalane, 0.4% Tween 80, 5% pluronic-block polymer L121,
and thr-MDP, either microfluidized into a submicron emulsion or vortexed to generate a larger
particle size emulsion;

(6) Ribi™ adjuvant system (RAS), (Ribi Immunochem) containing 2% Squalene, 0.2%
Tween 80, and one or more bacterial cell wall components from the group consisting of
monophosphoryl lipid A (MPL), trehalose dimycolate (TDM), and cell wall skeleton (CWS),
preferably MPL + CWS (Detox™); and

(7) one or more mineral salts (such as an aluminum salt) + a non-toxic derivative of LPS
(such as 3dMPL).

Aluminium salts and MF59 are preferred adjuvants for parenteral immunisation. Mutant bacterial 
toxins are preferred mucosal adjuvants. 

The composition may include an antibiotic.

Further antigens

The compositions of the invention may further comprise one or more additional non-GAS antigens, 
including additional bacterial, viral or parasitic antigens. 

In one embodiment, the GAS antigen combinations of the invention are combined with one or more 
additional, non-GAS antigens suitable for use in a paediatric vaccine. For example, the GAS antigen 
combinations may be combined with one or more antigens derived from a bacteria or virus selected 
from the group consisting of N. meningitidis (including serogroup A, B, C, W135 and/or Y), 
Streptococcus pneumoniae, Bordetella pertussis, Moraxella catarrhalis, Tetanus, Diphtheria, 
Respiratory Syncytial virus ('RSV'), polio, measles, mumps, rubella, and rotavirus.

In another embodiment, the GAS antigen combinations of the invention are combined with one or
more additional, non-GAS antigens suitable for use in a vaccine designed to protect elderly or
immunocompromised individuals. For example, the GAS antigen combinations may be combined
with an antigen derived from the group consisting of Enterococcus faecalis, Staphylococcus
aureus, Staphylococcus epidermidis, Pseudomonas aeruginosa, Legionella pneumophila, Listeria
monocytogenes, influenza, and Parainfluenza virus ('PIV').

Where a saccharide or carbohydrate antigen is used, it is preferably conjugated to a carrier protein in
order to enhance immunogenicity {e.g. refs. 55 to 64}. Preferred carrier proteins are bacterial toxins
or toxoids, such as diphtheria or tetanus toxoids. The CRM197 diphtheria toxoid is particularly
preferred {65}. Other carrier polypeptides include the N.meningitidis outer membrane protein {66},
synthetic peptides {67, 68}, heat shock proteins {69, 70}, pertussis proteins {71, 72}, protein D from
H.influenzae {73}, cytokines {74}, lymphokines, hormones, growth factors, toxin A or B from
C.difficile {75}, iron-uptake proteins {76}, etc. Where a mixture comprises capsular saccharides from
both serogroups A and C, it may be preferred that the ratio (w/w) of MenA saccharide:MenC
saccharide is greater than 1 (eg. 2:1, 3:1, 4:1, 5:1, 10:1 or higher). Different saccharides can be
conjugated to the same or different type of carrier protein. Any suitable conjugation reaction can be
used, with any suitable linker where necessary.

Toxic protein antigens may be detoxified where necessary eg. detoxification of pertussis toxin by 
chemical and/or genetic means. 

Where a diphtheria antigen is included in the composition it is preferred also to include tetanus
antigen and pertussis antigens. Similarly, where a tetanus antigen is included it is preferred also to
include diphtheria and pertussis antigens. Similarly, where a pertussis antigen is included it is
preferred also to include diphtheria and tetanus antigens.

Antigens in the composition will typically be present at a concentration of at least 1 μg/ml each. In
general, the concentration of any given antigen will be sufficient to elicit an immune response against
that antigen.

As an alternative to using protein antigens in the composition of the invention, nucleic acid encoding 
the antigen may be used {e.g. refs. 77 to 85}. Protein components of the compositions of the 
invention may thus be replaced by nucleic acid (preferably DNA eg. in the form of a plasmid) that 
encodes the protein.

Definitions 

The term "comprising" means "including" as well as "consisting" eg. a composition "comprising" X 
may consist exclusively of X or may include something additional eg. X + Y. 

The term "about" in relation to a numerical value x means, for example, x±10%.
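Reading this definition as a symmetric band of 10% around x (an interpretation, since the text gives it only as an example), a value can be tested against "about x" as in the following minimal Python sketch. The function name and the default tolerance parameter are illustrative assumptions.

```python
def is_about(value, x, tolerance=0.10):
    """Return True if value lies within x +/- tolerance*|x| (10 % by default)."""
    return abs(value - x) <= tolerance * abs(x)


# A pH of 7.5 falls within "about 7" under this reading; a pH of 6 does not.
print(is_about(7.5, 7), is_about(6.0, 7))
```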

References to a percentage sequence identity between two amino acid sequences means that, when
aligned, that percentage of amino acids are the same in comparing the two sequences. This alignment
and the percent homology or sequence identity can be determined using software programs known in
the art, for example those described in section 7.7.18 of reference 86. A preferred alignment is
determined by the Smith-Waterman homology search algorithm using an affine gap search with a gap
open penalty of 12 and a gap extension penalty of 2, BLOSUM matrix of 62. The Smith-Waterman
homology search algorithm is disclosed in reference 87.
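The alignment step described above can be sketched as follows. This is a minimal, self-contained Python illustration of Smith-Waterman local alignment with affine gaps (the Gotoh recursion), using the stated gap open penalty of 12 and gap extension penalty of 2. Several points are assumptions rather than statements from the text: the BLOSUM62 matrix is replaced by a toy match/mismatch score to keep the sketch short, the first gapped position is charged the open penalty and each further position the extension penalty, and percent identity is computed over all aligned columns including gaps.

```python
def toy_score(x, y):
    # Placeholder substitution score; the text specifies BLOSUM62 instead.
    return 5 if x == y else -4


def smith_waterman_affine(a, b, score=toy_score, gap_open=12, gap_extend=2):
    """Local alignment with affine gaps. Returns (best_score, aligned_a, aligned_b)."""
    neg = float("-inf")
    n, m = len(a), len(b)
    H = [[0] * (m + 1) for _ in range(n + 1)]    # best local score ending at (i, j)
    E = [[neg] * (m + 1) for _ in range(n + 1)]  # ends with a gap in a
    F = [[neg] * (m + 1) for _ in range(n + 1)]  # ends with a gap in b
    best, best_pos = 0, (0, 0)
    for i in range(1, n + 1):
        for j in range(1, m + 1):
            E[i][j] = max(H[i][j - 1] - gap_open, E[i][j - 1] - gap_extend)
            F[i][j] = max(H[i - 1][j] - gap_open, F[i - 1][j] - gap_extend)
            diag = H[i - 1][j - 1] + score(a[i - 1], b[j - 1])
            H[i][j] = max(0, diag, E[i][j], F[i][j])
            if H[i][j] > best:
                best, best_pos = H[i][j], (i, j)
    # Trace back from the best cell, tracking which matrix the path is in.
    i, j = best_pos
    out_a, out_b, state = [], [], "H"
    while i > 0 and j > 0:
        if state == "H":
            if H[i][j] == 0:
                break
            if H[i][j] == H[i - 1][j - 1] + score(a[i - 1], b[j - 1]):
                out_a.append(a[i - 1])
                out_b.append(b[j - 1])
                i, j = i - 1, j - 1
            elif H[i][j] == E[i][j]:
                state = "E"
            else:
                state = "F"
        elif state == "E":
            out_a.append("-")
            out_b.append(b[j - 1])
            if E[i][j] == H[i][j - 1] - gap_open:
                state = "H"  # this column opened the gap
            j -= 1
        else:
            out_a.append(a[i - 1])
            out_b.append("-")
            if F[i][j] == H[i - 1][j] - gap_open:
                state = "H"
            i -= 1
    return best, "".join(reversed(out_a)), "".join(reversed(out_b))


def percent_identity(aligned_a, aligned_b):
    """Identical residues as a percentage of aligned columns (gap columns included)."""
    if not aligned_a:
        return 0.0
    same = sum(x == y and x != "-" for x, y in zip(aligned_a, aligned_b))
    return 100.0 * same / len(aligned_a)


# Made-up sequences, for illustration only.
s, top, bottom = smith_waterman_affine("MKTAYIAKQR", "XXKTAYIAKZZ")
print(s, top, bottom, percent_identity(top, bottom))
```

For real protein comparisons the toy score would be swapped for a BLOSUM62 lookup passed in through the `score` argument; other denominators for percent identity (for example the length of the shorter sequence) are also in common use.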

The following example demonstrates one way of preparing recombinant GAS antigens of the 
invention and testing their efficacy in a murine model. 

EXAMPLE 1: Preparation of recombinant GAS antigens
of the invention and Demonstration of Efficacy in Murine Model.

Recombinant GAS proteins corresponding to two or more of the GAS antigens of the first antigen
group are expressed as follows.

1. Cloning of GAS antigens for expression in E. coli

The selected GAS antigens were cloned in such a way as to obtain two different kinds of
recombinant proteins: (1) proteins having a hexa-histidine tag at the carboxy-terminus (Gas-His)
and (2) proteins having the hexa-histidine tag at the carboxy-terminus and GST at the
amino-terminus (Gst-Gas-His). Type (1) proteins were obtained by cloning in a pET21b+ vector
(available from Novagen). The type (2) proteins were obtained by cloning in a pGEX-NNH
vector. This cloning strategy made it possible to use the GAS genomic DNA to amplify the
selected genes by PCR, to perform a single restriction enzyme digestion of the PCR products and
to clone them simultaneously into both vectors.

(a) Construction of pGEX-NNH expression vectors

Two pairs of complementary oligodeoxyribonucleotides are synthesised using the DNA synthesiser
ABI394 (Perkin Elmer) and reagents from Cruachem (Glasgow, Scotland). Equimolar amounts of the
oligo pairs (50 ng each oligo) are annealed in T4 DNA ligase buffer (New England Biolabs) for 10
min in a final volume of 50 μl and then left to cool slowly at room temperature. With the described
procedure the following DNA linkers are obtained:

gexNN linker (restriction sites, in order: NdeI, NheI, XmaI, EcoRI, NcoI, SalI, XhoI, SacI, NotI)

GATCCCATATGGCTAGCCCGGGGAATTCGTCCATGGAGTGAGTCGACTGACTCGAGTGATCGAGCTCCTGAGCGGCCGCATGAA
    GGTATACCGATCGGGCCCCTTAAGCAGGTACCTCACTCAGCTGACTGAGCTCACTAGCTCGAGGACTCGCCGGCGTACTTTCGA

gexNNH linker (in order: HindIII, NotI/XhoI, Hexa-Histidine)

TCGACAAGCTTGCGGCCGCACTCGAGCATC^CCATCACCATCACTGLAT
    GTTCGAACGCCGGCGTGAGCACGTAGAGGTAGTGGTAGTGACTATCGA
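
A quick consistency check of an annealed linker can be sketched in Python as below. It takes the two gexNN strands as printed above, with the upper strand read 5' to 3' and the lower strand written 3' to 5' beneath it, and confirms that the paired region is complementary and that four-nucleotide overhangs remain at each end. Joining the two printed segments of each strand into one string, the four-nucleotide overhang length, and all function names are assumptions made for this illustration.

```python
COMPLEMENT = str.maketrans("ACGT", "TGCA")

# gexNN strands as printed; each is the concatenation of its two printed segments.
GEXNN_TOP = (
    "GATCCCATATGGCTAGCCCGGGGAATTCGTCCATGGAGTGAGTCGACTGACTCGAGTGATCGAGCTC"
    "CTGAGCGGCCGCATGAA"
)  # written 5' -> 3'
GEXNN_BOTTOM = (
    "GGTATACCGATCGGGCCCCTTAAGCAGGTACCTCACTCAGCTGACTGAGCTCACTAGCTCGAG"
    "GACTCGCCGGCGTACTTTCGA"
)  # written 3' -> 5', aligned under the paired part of the top strand


def check_linker(top, bottom, overhang=4):
    """Check base pairing of an annealed linker with a 5' overhang at each end."""
    paired_top = top[overhang:]         # top strand without its left overhang
    paired_bottom = bottom[:-overhang]  # bottom strand without its right overhang
    return {
        "left_overhang": top[:overhang],
        "right_overhang": bottom[-overhang:][::-1],  # reversed to read 5' -> 3'
        "paired": paired_top.translate(COMPLEMENT) == paired_bottom,
    }


result = check_linker(GEXNN_TOP, GEXNN_BOTTOM)
print(result)
```

The overhangs found this way (GATC and AGCT) are the single-stranded ends left by BamHI and HindIII digestion, which matches the vector preparation described in the following paragraph.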

The plasmid pGEX-KG [K. L. Guan and J. E. Dixon, Anal. Biochem. 192, 262 (1991)] is digested
with BamHI and HindIII and 100 ng is ligated overnight at 16 °C to the linker gexNN with a molar
ratio of 3:1 linker/plasmid using 200 units of T4 DNA ligase (New England Biolabs). After
transformation of the ligation product in E. coli DH5, a clone containing the pGEX-NN plasmid,
having the correct linker, is selected by means of restriction enzyme analysis and DNA sequencing.
The new plasmid pGEX-NN is digested with SalI and HindIII and ligated to the linker gexNNH. After
transformation of the ligation product in E. coli DH5, a clone containing the pGEX-NNH plasmid,
having the correct linker, is selected by means of restriction enzyme analysis and DNA sequencing.

(b) Chromosomal DNA preparation 

GAS SF370 strain is grown in THY medium until OD600 is 0.6-0.8. Bacteria are then centrifuged,
suspended in TES buffer with lysozyme (10 mg/ml) and mutanolysin (10 U/μl) and incubated 1 hr at
37 °C. Following treatment of the bacterial suspension with RNAase, Proteinase K and 10%
Sarcosyl/EDTA, protein extraction with saturated phenol and phenol/chloroform is carried out. The
resulting supernatant is precipitated with Sodium Acetate/Ethanol and the extracted DNA is pelleted
by centrifugation, suspended in Tris buffer and kept at -20 °C.

(c) Oligonucleotide design 

Synthetic oligonucleotide primers are designed on the basis of the coding sequence of each GAS
antigen using the sequence of Streptococcus pyogenes SF370 M1 strain. Any predicted signal peptide
is omitted, by deducing the 5' end amplification primer sequence immediately downstream from the
predicted leader sequence. For most GAS antigens, the 5' tails of the primers (see Table 1, below)
include only one restriction enzyme recognition site (NdeI, or NheI, or SpeI depending on the gene's
own restriction pattern); the 3' primer tails (see Table 1) include a XhoI or a NotI or a HindIII
restriction site.



5' tails                            3' tails

NdeI    5' GTGCGTCATATG 3'          XhoI       5' GCGTCTCGAG 3'
NheI    5' GTGCGTGCTAGC 3'          NotI       5' ACTCGCTAGCGGCCGC 3'
SpeI    5' GTGCGTACTAGT 3'          HindIII    5' GCGTAAGCTT 3'

Table 1. Oligonucleotide tails of the primers used to amplify genes encoding selected GAS
antigens.

As well as containing the restriction enzyme recognition sequences, the primers include nucleotides
which hybridize to the sequence to be amplified. The number of hybridizing nucleotides depends on
the melting temperature of the primers which can be determined as described [Breslauer et al., Proc.
Nat. Acad. Sci. 83, 3746-50 (1986)]. The average melting temperature of the selected oligos is 50-55
°C for the hybridizing region alone and 65-75 °C for the whole oligos. Oligos can be purchased from
MWG-Biotech S.p.A. (Firenze, Italy).
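
The primer design described above (a restriction-site tail from Table 1 followed by a gene-specific hybridizing region) can be sketched in Python as follows. The tail sequences are those of Table 1; the hybridizing region is a made-up example, the function names are hypothetical, and the melting temperature is estimated with the simple Wallace rule (2 °C per A/T, 4 °C per G/C), which is only a rough stand-in for the nearest-neighbour method of Breslauer et al. cited in the text and is not reliable for long oligos.

```python
# Tails from Table 1 (5' -> 3'); the key names are labels chosen for this sketch.
FIVE_PRIME_TAILS = {
    "NdeI": "GTGCGTCATATG",
    "NheI": "GTGCGTGCTAGC",
    "SpeI": "GTGCGTACTAGT",
}
THREE_PRIME_TAILS = {
    "XhoI": "GCGTCTCGAG",
    "NotI": "ACTCGCTAGCGGCCGC",
    "HindIII": "GCGTAAGCTT",
}


def build_primer(tail, hybridizing_region):
    """Prepend a restriction-site tail to the gene-specific hybridizing region."""
    return tail + hybridizing_region.upper()


def wallace_tm(seq):
    """Rough Tm estimate in degrees C: 2 per A/T plus 4 per G/C (Wallace rule)."""
    seq = seq.upper()
    at = seq.count("A") + seq.count("T")
    gc = seq.count("G") + seq.count("C")
    return 2 * at + 4 * gc


# Hypothetical hybridizing region, for illustration only.
region = "GATACTGTTGCAGCAAAT"
forward = build_primer(FIVE_PRIME_TAILS["NdeI"], region)
print(forward, wallace_tm(region))
```

A reverse primer would be built the same way from a 3' tail and the reverse complement of the end of the coding sequence.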

(d) PCR amplification

The standard PCR protocol is as follows: 50 ng genomic DNA are used as template in the presence of
0.2 mM each primer, 200 μM each dNTP, 1.5 mM MgCl2, 1 x PCR buffer minus Mg (Gibco-BRL),
and 2 units of Taq DNA polymerase (Platinum Taq, Gibco-BRL) in a final volume of 100 μl. Each
sample undergoes a double-step amplification: the first 5 cycles are performed using as the
hybridization temperature that of the oligos excluding the restriction enzyme tail, followed by 25
cycles performed according to the hybridization temperature of the whole length primers. The
standard cycles are as follows:

one cycle:
denaturation: 94 °C, 2 min

5 cycles:
denaturation: 94 °C, 30 seconds
hybridization: 51 °C, 50 seconds
elongation: 72 °C, 1 min or 2 min and 40 sec

25 cycles:
denaturation: 94 °C, 30 seconds
hybridization: 70 °C, 50 seconds
elongation: 72 °C, 1 min or 2 min and 40 sec

72 °C, 7 min
4 °C

The elongation time is 1 min for GAS antigens encoded by ORFs shorter than 2000 bp, and 2 min and 
40 seconds for ORFs longer than 2000 bp. The amplifications are performed using a Gene Amp PCR 
system 9600 (Perkin Elmer). 
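
The double-step cycling scheme and the ORF-length rule for the elongation time can be written down as a small data structure, as in the Python sketch below. The function names and the layout of the program are assumptions made for this illustration. The hybridization temperature of the first five cycles is passed in as a parameter, an ORF of exactly 2000 bp (a case the text does not specify) is treated as long, and the final 4 °C hold is left out because it has no fixed duration.

```python
def elongation_seconds(orf_length_bp):
    """1 min for ORFs shorter than 2000 bp, otherwise 2 min 40 s."""
    return 60 if orf_length_bp < 2000 else 160


def pcr_program(orf_length_bp, first_hybridization_c, second_hybridization_c=70):
    """Return the cycling program as a list of blocks of (name, temp C, seconds) steps."""
    ext = elongation_seconds(orf_length_bp)
    return [
        {"cycles": 1, "steps": [("denaturation", 94, 120)]},
        {"cycles": 5, "steps": [("denaturation", 94, 30),
                                ("hybridization", first_hybridization_c, 50),
                                ("elongation", 72, ext)]},
        {"cycles": 25, "steps": [("denaturation", 94, 30),
                                 ("hybridization", second_hybridization_c, 50),
                                 ("elongation", 72, ext)]},
        {"cycles": 1, "steps": [("final elongation", 72, 420)]},
    ]


def total_hold_seconds(program):
    """Sum of programmed hold times, ignoring temperature ramping."""
    return sum(block["cycles"] * sum(sec for _, _, sec in block["steps"])
               for block in program)


short_orf = pcr_program(1500, first_hybridization_c=51)
long_orf = pcr_program(2500, first_hybridization_c=51)
print(total_hold_seconds(short_orf), total_hold_seconds(long_orf))
```

Actual run times on a thermal cycler are longer than these totals because ramping between temperatures is not counted.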

To check the amplification results, 4 μl of each PCR product is loaded onto a 1-1.5% agarose gel and the
size of amplified fragments compared with DNA molecular weight standards (DNA markers III or IX,
Roche). The PCR products are loaded on agarose gel and after electrophoresis the right size bands are
excised from the gel. The DNA is purified from the agarose using the Gel Extraction Kit (Qiagen)
following the instructions of the manufacturer. The final elution volume of the DNA is 50 μl TE (10
mM Tris-HCl, 1 mM EDTA, pH 8). One μl of each purified DNA is loaded onto agarose gel to
evaluate the yield.

(e) Digestion of PCR fragments 

One to two μg of purified PCR products are double digested overnight at 37 °C with the appropriate
restriction enzymes (60 units of each enzyme) using the appropriate restriction buffer in 100 μl final
volume. The restriction enzymes and the digestion buffers are from New England Biolabs. After
purification of the digested DNA (PCR purification Kit, Qiagen) and elution with 30 μl TE, 1 μl is
subjected to agarose gel electrophoresis to evaluate the yield in comparison to titrated molecular
weight standards (DNA markers III or IX, Roche).

(f) Digestion of the cloning vectors (pET21b+ and pGEX-NNH) 

35 10 fig of plasmid is double digested with 100 units of each restriction enzyme in 400 /xl reaction 

volume in the presence of appropriate buffer by overnight incubation at 37 °C. After electrophoresis 
on a 1% agarose gel, the band corresponding to the digested vector is purified from the gel using the 
Qiagen Qiaex II Gel Extraction Kit and the DNA was eluted with 50 /d TE. The DNA concentration 
is evaluated by measuring OD 26 o of the sample. 

(g) Cloning of the PCR products

Seventy-five ng of the appropriately digested and purified vectors and the digested and purified
fragments corresponding to each selected GAS antigen are ligated in final volumes of 10-20 µl with a
molar ratio of 1:1 fragment/vector, using 400 units of T4 DNA ligase (New England Biolabs) in the
presence of the buffer supplied by the manufacturer. The reactions are incubated overnight at 16 °C.
Transformation of E. coli BL21 (Novagen) and E. coli BL21-DE3 (Novagen) electrocompetent cells is
performed using pGEX-NNH ligations and pET21b+ ligations respectively. The transformation
procedure is as follows: 1-2 µl of the ligation reaction is mixed with 50 µl of ice-cold competent
cells, then the cells are transferred to a Gene Pulser 0.1 cm electrode cuvette (Biorad). After
pulsing the cells in a MicroPulser electroporator (Biorad) following the manufacturer's instructions,
the cells are suspended in 0.95 ml of SOC medium and incubated for 45 min at 37 °C under shaking. 100
and 900 µl of the cell suspension are plated on separate LB agar plates containing 100 µg/ml
ampicillin and the plates are incubated overnight at 37 °C. The screening of the transformants is
done by PCR: randomly chosen transformants are picked and suspended in 30 µl of PCR reaction mix
containing the PCR buffer, the 4 dNTPs, 1.5 mM MgCl2, Taq polymerase and appropriate forward and
reverse oligonucleotide primers that are able to hybridize upstream and downstream from the
polylinker of the pET21b+ or pGEX-NNH vectors. After 30 cycles of PCR, 5 µl of the resulting products
are run on agarose gel electrophoresis in order to select positive clones from which the expected
PCR band is obtained. PCR positive clones are chosen on the basis of the correct size of the PCR
product, as evaluated by comparison with appropriate molecular weight markers (DNA markers III or IX,
Roche).
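
For a 1:1 molar ratio, the mass of fragment needed scales with the ratio of fragment length to vector length. The following sketch shows that arithmetic for the 75 ng of vector used here; the vector and fragment lengths are hypothetical placeholders, not values given in this document.

```python
def insert_mass_ng(vector_ng, vector_bp, insert_bp, insert_to_vector=1.0):
    """Mass of insert (ng) giving the requested insert:vector molar ratio."""
    # Equal moles of two dsDNA molecules have masses proportional to their lengths.
    return vector_ng * (insert_bp / vector_bp) * insert_to_vector


# Hypothetical lengths: a 5400 bp vector and a 1200 bp digested fragment.
needed = insert_mass_ng(75, vector_bp=5400, insert_bp=1200)
print(f"{needed:.1f} ng of fragment for 75 ng of vector")
```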

2. Protein expression 

PCR positive colonies are inoculated in 3 ml LB containing 100 µg/ml ampicillin and grown at 37 °C
overnight. 70 µl of the overnight culture is inoculated in 2 ml LB/Amp and grown at 37 °C until the
OD600 of the pET clones reaches 0.4-0.8 or until the OD600 of the pGEX clones reaches 0.8-1.
Protein expression is then induced by adding 1 mM IPTG (isopropyl β-D-thiogalactopyranoside) to
the mini-cultures. After 3 hours of incubation at 37 °C the final OD600 is checked and the cultures
are cooled on ice. After centrifugation of 0.5 ml of culture, the cell pellet is suspended in 50 µl
of protein Loading Sample Buffer (60 mM TRIS-HCl pH 6.8, 5% w/v SDS, 10% v/v glycerin, 0.1% w/v
Bromophenol Blue, 100 mM DTT) and incubated at 100 °C for 5 min. A volume of boiled sample
corresponding to 0.1 OD600 of culture is analysed by SDS-PAGE and Coomassie Blue staining to verify
the presence of the induced protein band.
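
The loading volume depends on the final OD600 of each mini-culture. The sketch below assumes that "0.1 OD600 of culture" means the cells contained in 1 ml of culture at OD600 0.1 (0.1 OD units); that reading, and the example OD value, are assumptions made for illustration.

```python
def load_volume_ul(final_od600, pelleted_ml=0.5, buffer_ul=50.0, target_od_units=0.1):
    """Volume of boiled sample (ul) holding the target number of OD600 units."""
    if final_od600 <= 0:
        raise ValueError("final OD600 must be positive")
    od_units_in_pellet = final_od600 * pelleted_ml  # OD600 x ml of culture pelleted
    return target_od_units / od_units_in_pellet * buffer_ul


# Hypothetical culture that reached OD600 = 2.0 after induction.
print(f"{load_volume_ul(2.0):.1f} ul")
```

With the volumes given above (0.5 ml pelleted, 50 µl of buffer) this reduces to 10 divided by the final OD600, in µl.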
3. Purification of the recombinant proteins 

Single colonies are inoculated in 25 ml LB containing 100 µg/ml ampicillin and grown at 37 °C
overnight. The overnight culture is inoculated in 500 ml LB/Amp and grown under shaking at 25 °C
until OD600 0.4-0.7. Protein expression is then induced by adding 1 mM IPTG to the cultures. After
3.5 hours of incubation at 25 °C the final OD600 is checked and the cultures are cooled on ice. After
centrifugation at 6000 rpm (JA10 rotor, Beckman), the cell pellet is processed for purification or
frozen at -20°C.

(a) Procedure for the purification of soluble His-tagged proteins from E. coli

(1) Transfer the pellets from -20°C to an ice bath and reconstitute with 10 ml 50 mM NaH2PO4 buffer,
300 mM NaCl, pH 8.0; transfer to 40-50 ml centrifugation tubes and break the cells as per the
following outline.

(2) Break the pellets in the French Press, performing three passages with in-line washing.

(3) Centrifuge at about 30-40000 x g for 15-20 min. If possible use rotor JA 25.50 (21000 rpm, 15
min.) or JA-20 (18000 rpm, 15 min.).

(4) Equilibrate the Poly-Prep columns, containing 1 ml Fast Flow Chelating Sepharose resin, with 50
mM phosphate buffer, 300 mM NaCl, pH 8.0.

(5) Store the centrifugation pellet at -20°C, and load the supernatant in the columns.

(6) Collect the flow through.

(7) Wash the columns with 10 ml (2 ml + 2 ml + 4 ml) 50 mM phosphate buffer, 300 mM NaCl, pH 8.0.

(8) Wash again with 10 ml 20 mM imidazole buffer, 50 mM phosphate, 300 mM NaCl, pH 8.0.

(9) Elute the proteins bound to the columns with 4.5 ml (1.5 ml + 1.5 ml + 1.5 ml) 250 mM imidazole
buffer, 50 mM phosphate, 300 mM NaCl, pH 8.0 and collect the 3 corresponding fractions of ~1.5 ml
each. Add to each tube 15 µl DTT 200 mM (final concentration 2 mM).

(10) Measure the protein concentration of the first two fractions with the Bradford method, collect a
10 µg aliquot of proteins from each sample and analyse by SDS-PAGE. (N.B.: should the sample be too
diluted, load 21 µl + 7 µl loading buffer).

(11) Store the collected fractions at +4°C while waiting for the results of the SDS-PAGE analysis.

(12) For immunisation prepare 4-5 aliquots of 100 µg each in 0.5 ml in 40% glycerol. The dilution
buffer is the above elution buffer, plus 2 mM DTT. Store the aliquots at -20°C until immunisation.
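
Two pieces of arithmetic in the steps above can be checked with a short sketch: the DTT concentration obtained by adding 15 µl of 200 mM stock to a ~1.5 ml fraction, and the volumes needed for a 100 µg aliquot in 0.5 ml at 40% glycerol. It assumes that the glycerol is added as an undiluted (100%) stock and that 40% is volume per volume; neither is stated in the text, and the protein concentration used is hypothetical.

```python
def final_conc_mM(stock_mM, added_ul, sample_ul):
    """Concentration after adding a small volume of stock to a sample."""
    return stock_mM * added_ul / (sample_ul + added_ul)


def aliquot_recipe_ul(protein_mg_per_ml, protein_ug=100.0, total_ul=500.0,
                      glycerol_fraction=0.40):
    """Volumes (ul) of protein fraction, glycerol and dilution buffer per aliquot."""
    protein_ul = protein_ug / protein_mg_per_ml  # mg/ml equals ug/ul
    glycerol_ul = total_ul * glycerol_fraction   # assumes a 100% glycerol stock
    buffer_ul = total_ul - glycerol_ul - protein_ul
    if buffer_ul < 0:
        raise ValueError("fraction too dilute to fit 100 ug into the aliquot volume")
    return {"protein": protein_ul, "glycerol": glycerol_ul, "buffer": buffer_ul}


dtt = final_conc_mM(stock_mM=200, added_ul=15, sample_ul=1500)
print(f"DTT: {dtt:.2f} mM")
# Hypothetical Bradford result of 1.0 mg/ml for the eluted fraction.
recipe = aliquot_recipe_ul(1.0)
print(recipe)
```

The DTT figure comes out just under the nominal 2 mM because the added volume slightly dilutes the fraction.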

(b) Purification of His-tagged proteins from inclusion bodies

Purifications are carried out essentially according to the following protocol:

(1) Bacteria are collected from 500 ml cultures by centrifugation. If required, store bacterial
pellets at -20°C. For extraction, resuspend each bacterial pellet in 10 ml 50 mM TRIS-HCl buffer, pH
8.5, on an ice bath.

(2) Disrupt the resuspended bacteria with a French Press, performing two passages.

(3) Centrifuge at 35000 x g for 15 min and collect the pellets. Use a Beckman rotor JA 25.50 (21000
rpm, 15 min.) or JA-20 (18000 rpm, 15 min.).

(4) Dissolve the centrifugation pellets with 50 mM TRIS-HCl, 1 mM TCEP {Tris(2-carboxyethyl)phosphine
hydrochloride, Pierce}, 6M guanidinium chloride, pH 8.5. Stir for ~10 min. with a magnetic bar.

(5) Centrifuge as described above, and collect the supernatant.

(6) Prepare an adequate number of Poly-Prep (Bio-Rad) columns containing 1 ml of Fast Flow
Chelating Sepharose (Pharmacia) saturated with nickel according to the manufacturer's
recommendations. Wash the columns twice with 5 ml of H2O and equilibrate with 50 mM TRIS-HCl, 1 mM
TCEP, 6M guanidinium chloride, pH 8.5.

(7) Load the supernatants from step 5 onto the columns, and wash with 5 ml of 50 mM TRIS-HCl
buffer, 1 mM TCEP, 6M urea, pH 8.5.

(8) Wash the columns with 10 ml of 20 mM imidazole, 50 mM TRIS-HCl, 6M urea, 1 mM TCEP, pH 8.5.
Collect and set aside the first 5 ml for possible further controls.

(9) Elute the proteins bound to the columns with 4.5 ml of a buffer containing 250 mM imidazole, 50
mM TRIS-HCl, 6M urea, 1 mM TCEP, pH 8.5. Add the elution buffer in three 1.5 ml aliquots, and
collect the corresponding 3 fractions. Add to each fraction 15 µl DTT (final concentration 2 mM).

(10) Measure eluted protein concentration with the Bradford method, and analyse aliquots of ca. 10
µg of protein by SDS-PAGE.

(11) Store proteins at -20°C in 40% (v/v) glycerol, 50 mM TRIS-HCl, 2M urea, 0.5 M arginine, 2
mM DTT, 0.3 mM TCEP, 83.3 mM imidazole, pH 8.5.
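
The imidazole, urea and TCEP concentrations in step (11) match a three-fold dilution of the elution buffer from step (9). This is an observation drawn from the numbers, not something the text states, and the sketch below only checks that arithmetic. How Tris, DTT, glycerol and arginine are brought to their stated values is not described, so they are left out.

```python
# Elution buffer components from step (9) that change on dilution.
elution = {"imidazole_mM": 250.0, "urea_M": 6.0, "TCEP_mM": 1.0}


def dilute(components, fold):
    """Concentrations after diluting a buffer `fold` times."""
    return {name: value / fold for name, value in components.items()}


stored = dilute(elution, 3)  # assumed 1:3 dilution, inferred from the stated values
for name, value in stored.items():
    print(f"{name}: {value:.1f}")
```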

(c) Procedure for the purification of GST-fusion proteins from E. coli

(1) Transfer the bacterial pellets from -20°C to an ice bath and suspend with 7.5 ml PBS, pH 7.4, to
which a mixture of protease inhibitors (COMPLETE™ - Boehringer Mannheim, 1 tablet every 25 ml
of buffer) has been added.

(2) Transfer to 40-50 ml centrifugation tubes and sonicate according to the following procedure:

a. Position the probe at about 0.5 cm from the bottom of the tube.

b. Block the tube with the clamp.

c. Dip the tube in an ice bath.

d. Set the sonicator as follows: Timer -> Hold, Duty Cycle -> 55, Out Control -> 6.

e. Perform 5 cycles of 10 impulses at a time lapse of 1 minute (i.e. each of the five cycles = 10
impulses + ~45" hold).

(3) Centrifuge at about 30-40000 x g for 15-20 min. E.g.: use rotor Beckman JA 25.50 at 21000 rpm,
for 15 min.

(4) Store the centrifugation pellets at -20°C, and load the supernatants on the chromatography
columns, as follows.

(5) Equilibrate the Poly-Prep (Bio-Rad) columns with 0.5 ml (=1 ml suspension) of
Glutathione-Sepharose 4B resin, wash with 2 ml (1 + 1) H2O, and then with 10 ml (2 + 4 + 4) PBS, pH
7.4.

(6) Load the supernatants on the columns and discard the flow through.

(7) Wash the columns with 10 ml (2 + 4 + 4) PBS, pH 7.4.

(8) Elute the proteins bound to the columns with 4.5 ml of 50 mM TRIS buffer, 10 mM reduced
glutathione, pH 8.0, adding 1.5 ml + 1.5 ml + 1.5 ml and collecting the respective 3 fractions of
~1.5 ml each.

(9) Measure the protein concentration of the first two fractions with the Bradford method, analyse a
10 µg aliquot of proteins from each sample by SDS-PAGE. (N.B.: if the sample is too diluted load 21
µl + 7 µl loading buffer).

(10) Store the collected fractions at +4°C while waiting for the results of the SDS-PAGE analysis.

(11) For each protein destined for immunisation prepare 4-5 aliquots of 100 µg each in 0.5 ml of
40% glycerol. The dilution buffer is 50 mM TRIS-HCl, 2 mM DTT, pH 8.0. Store the aliquots at
-20°C until immunisation.

4. Murine Model of Protection from GAS Infection 
(a) Immunization protocol 

Groups of 10 CD1 female mice aged between 6 and 7 weeks are immunized with two or more GAS
antigens of the invention (20 µg of each recombinant GAS antigen), suspended in 100 µl of suitable
solution. Each group receives 3 doses at days 0, 21 and 45. Immunization is performed through
intraperitoneal injection of the protein with an equal volume of Complete Freund's Adjuvant (CFA)
for the first dose and Incomplete Freund's Adjuvant (IFA) for the following two doses. In each
immunization scheme negative and positive control groups are used.

For the negative control group, mice are immunized with E. coli proteins eluted from the purification
columns following processing of total bacterial extract from an E. coli strain containing either the
pET21b or the pGEX-NNH vector (thus expressing GST only) without any cloned GAS ORF (groups
can be indicated as HisStop or GSTStop respectively).

For the positive control groups, mice are immunized with purified GAS M cloned from either GAS
SF370 or GAS DSM 2071 strains (groups indicated as 192SF and 192DSM respectively).
Pooled sera from each group are collected before the first immunization and two weeks after the last
one. Mice are infected with GAS about a week after.
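
The timing described above (doses at days 0, 21 and 45, serum collection two weeks after the last dose, challenge about a week later) can be laid out as calendar dates. In the sketch below, "about a week" is taken as exactly 7 days after serum collection, which is an assumption, and the start date is arbitrary.

```python
from datetime import date, timedelta


def schedule(day0):
    """Return (event, study day, calendar date) tuples for one immunization group."""
    dose_days = [0, 21, 45]
    bleed_day = dose_days[-1] + 14   # two weeks after the last dose
    challenge_day = bleed_day + 7    # "about a week after": assumed to be 7 days
    events = [(f"dose {i + 1}", d) for i, d in enumerate(dose_days)]
    events.append(("immune serum collection", bleed_day))
    events.append(("GAS challenge (approximate)", challenge_day))
    return [(name, d, day0 + timedelta(days=d)) for name, d in events]


# Arbitrary example start date.
for name, day, when in schedule(date(2024, 1, 1)):
    print(f"day {day:>2}  {when.isoformat()}  {name}")
```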

Immunized mice are infected using a GAS strain different from that used for the cloning of the
selected proteins. For example, the GAS strain can be DSM 2071 M23 type, obtainable from the
German Collection of Microorganisms and Cell Cultures (DSMZ).

For infection experiments, DSM 2071 is grown at 37 °C in THY broth until OD600 0.4. Bacteria are
pelleted by centrifugation, washed once with PBS, suspended and diluted with PBS to obtain the
appropriate concentration of bacteria/ml and administered to mice by intraperitoneal injection.
Between 50 and 100 bacteria are given to each mouse, as determined by plating aliquots of the
bacterial suspension on 5 THY plates. Animals are observed daily and checked for survival.
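
The dose actually given can be estimated from the colony counts on the five plates. The sketch below is illustrative only: the plated volume, the injected volume and the colony counts are hypothetical, since the text does not specify them.

```python
def cfu_per_ml(colony_counts, plated_ul):
    """Mean CFU/ml of the suspension from replicate plate counts."""
    mean_colonies = sum(colony_counts) / len(colony_counts)
    return mean_colonies / (plated_ul / 1000.0)


def dose_cfu(conc_cfu_per_ml, injected_ul):
    """Bacteria delivered per mouse for a given injection volume."""
    return conc_cfu_per_ml * injected_ul / 1000.0


# Hypothetical counts from five THY plates, 100 ul plated and 100 ul injected.
counts = [72, 80, 68, 75, 80]
conc = cfu_per_ml(counts, plated_ul=100)
dose = dose_cfu(conc, injected_ul=100)
print(f"{conc:.0f} CFU/ml, {dose:.0f} CFU per mouse, in range: {50 <= dose <= 100}")
```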

5. Analysis of Immune Sera

(a) Preparation of GAS total protein extracts

Total protein extracts are prepared by incubating a bacterial culture grown to OD600 0.4-0.5 in 50 mM
Tris pH 6.8/mutanolysin (20 units/ml) for 2 hr at 37 °C, followed by incubation for ten minutes on
ice in 0.24 N NaOH and 0.96% β-mercaptoethanol. The extracted proteins are precipitated by addition
of trichloroacetic acid, washed with ice-cold acetone and suspended in protein loading buffer.

(b) Western blot analysis

Aliquots of total protein extract, mixed with SDS loading buffer (1x: 60 mM TRIS-HCl pH 6.8, 5%
w/v SDS, 10% v/v glycerin, 0.1% Bromophenol Blue, 100 mM DTT) and boiled for 5 minutes at 95 °C,
are loaded on a 12.5% SDS-PAGE precast gel (Biorad). The gel is run using an SDS-PAGE running
buffer containing 250 mM TRIS, 2.5 mM glycine and 0.1% SDS. The gel is electroblotted onto a
nitrocellulose membrane at 200 mA for 60 minutes. The membrane is blocked for 60 minutes with
PBS/0.05% Tween-20 (Sigma), 10% skimmed milk powder and incubated overnight at 4 °C with
PBS/0.05% Tween-20, 1% skimmed milk powder, with the appropriate dilution of the sera. After
washing twice with PBS/0.05% Tween, the membrane is incubated for 2 hours with
peroxidase-conjugated secondary anti-mouse antibody (Amersham) diluted 1:4000. The nitrocellulose is
washed three times for 10 minutes with PBS/0.05% Tween and once with PBS and thereafter developed
with the Opti-4CN Substrate Kit (Biorad).

(c) Preparation of Paraformaldehyde treated GAS cultures 

A bacterial culture grown to OD600 0.4-0.5 is washed once with PBS and concentrated four times in
PBS/0.05% paraformaldehyde. Following 1 hr of incubation at 37 °C with shaking, the treated culture
is kept overnight at 4 °C and complete inactivation of the bacteria is then verified by plating
aliquots on THY blood agar plates.

(d) FACS analysis of Paraformaldehyde treated GAS cultures with mouse immune sera

About 10^5 paraformaldehyde-inactivated bacteria are washed with 200 µl of PBS in a 96-well
U-bottom plate and centrifuged for 10 min. at 3000 g, at 4°C. The supernatant is discarded and the
bacteria are suspended in 20 µl of PBS-0.1% BSA. Eighty µl of either pre-immune or immune mouse
sera diluted in PBS-0.1% BSA are added to the bacterial suspension to a final dilution of either
1:100, 1:250 or 1:500, and incubated on ice for 30 min. Bacteria are washed once by adding 100 µl of
PBS-0.1% BSA, centrifuged for 10 min. at 3000 g, 4°C, suspended in 200 µl of PBS-0.1% BSA,
centrifuged again and suspended in 10 µl of Goat Anti-Mouse IgG, F(ab')2 fragment
specific-R-Phycoerythrin-conjugated (Jackson Immunoresearch Laboratories Inc., cat. N° 115-116-072)
in PBS-0.1% BSA to a final dilution of 1:100, and incubated on ice for 30 min. in the dark. Bacteria
are washed once by adding 180 µl of PBS-0.1% BSA and centrifuged for 10 min. at 3000 g, 4°C. The
supernatant is discarded and the bacteria are suspended in 200 µl of PBS. The bacterial suspension
is passed through the cytometric chamber of a FACS Calibur (Becton Dickinson, Mountain View, CA USA)
and 10,000 events are acquired. Data are analysed using Cell Quest Software (Becton Dickinson,
Mountain View, CA USA) by drawing a morphological dot plot (using forward and side scatter
parameters) on bacterial signals. A histogram plot is then created on FL2 intensity of fluorescence
log scale recalling the morphological region of bacteria.
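
Because 80 µl of diluted serum is added to 20 µl of bacterial suspension, the serum must be pre-diluted slightly less than the intended final dilution. The sketch below works out the pre-dilution for each final dilution named above, assuming the 100 µl total is simply the sum of the two volumes; the function name is illustrative.

```python
def predilution_fold(final_fold, serum_ul=80.0, bacteria_ul=20.0):
    """Fold-dilution of the serum stock that yields `final_fold` after mixing."""
    mixing_factor = (serum_ul + bacteria_ul) / serum_ul  # 100/80 = 1.25
    return final_fold / mixing_factor


for final in (100, 250, 500):
    print(f"final 1:{final} -> pre-dilute serum 1:{predilution_fold(final):.0f}")
```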

It will be understood that the invention has been described by way of example only and
modifications may be made whilst remaining within the scope and spirit of the invention.

FIGURE 1: Annotation of GAS 40

FIGURE 2: Schematic of GAS 40: putative surface exclusion protein prgA (873 aa)